Journal of Machine Learning Research: Latest Articles

A Brief Survey on the Approximation Theory for Sequence Modelling
Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-06-01; DOI: 10.4208/jml.221221
Haotian Jiang, Qianxiao Li, Zhong Li, Shida Wang
Citations: 0
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-06-01; DOI: 10.4208/jml.230105
Jihao Long and Jiequn Han
Citations: 0
Why Self-Attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-06-01; DOI: 10.4208/jml.221206
Chao Ma and Lexing Ying
Citations: 0
Selective inference for k-means clustering
IF 4.3; Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-05-01
Yiqun T Chen, Daniela M Witten

We consider the problem of testing for a difference in means between clusters of observations identified via k-means clustering. In this setting, classical hypothesis tests lead to an inflated Type I error rate. In recent work, Gao et al. (2022) considered a related problem in the context of hierarchical clustering. Unfortunately, their solution is highly tailored to the context of hierarchical clustering, and thus cannot be applied in the setting of k-means clustering. In this paper, we propose a p-value that conditions on all of the intermediate clustering assignments in the k-means algorithm. We show that the p-value controls the selective Type I error for a test of the difference in means between a pair of clusters obtained using k-means clustering in finite samples, and can be efficiently computed. We apply our proposal to hand-written digits data and to single-cell RNA-sequencing data.
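
The inflated Type I error mentioned in this abstract can be illustrated with a small simulation. The sketch below is not the authors' selective p-value; it only shows the problem that motivates it. It assumes 1-D data drawn from a single N(0, 1) population (so there is no true difference in means), a plain two-cluster Lloyd iteration, and a naive z-test with known variance. All function names are hypothetical.

```python
import math
import random
from statistics import mean


def two_means_1d(xs, max_iters=100):
    """Lloyd's algorithm with k=2 on 1-D data; returns (left cluster, right cluster)."""
    c_lo, c_hi = min(xs), max(xs)  # deterministic initial centroids
    for _ in range(max_iters):
        lo = [x for x in xs if abs(x - c_lo) <= abs(x - c_hi)]
        hi = [x for x in xs if abs(x - c_lo) > abs(x - c_hi)]
        new_lo, new_hi = mean(lo), mean(hi)
        if (new_lo, new_hi) == (c_lo, c_hi):
            break  # assignments are stable
        c_lo, c_hi = new_lo, new_hi
    return lo, hi


def naive_z_pvalue(a, b, sigma=1.0):
    """Two-sided z-test for equal means that ignores how the groups were chosen."""
    se = sigma * math.sqrt(1.0 / len(a) + 1.0 / len(b))
    z = (mean(a) - mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def rejection_rate(use_clustering, n=30, trials=500, alpha=0.05, seed=0):
    """Fraction of null datasets on which the naive test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # one population, no real clusters
        if use_clustering:
            a, b = two_means_1d(xs)  # groups chosen by looking at the data
        else:
            a, b = xs[: n // 2], xs[n // 2:]  # groups fixed before seeing the data
        rejections += naive_z_pvalue(a, b) < alpha
    return rejections / trials


print("naive test after k-means:", rejection_rate(True))
print("naive test, fixed split: ", rejection_rate(False))
```

With the fixed split the rejection rate should stay near the nominal level, while testing clusters found by k-means on the same data rejects far more often. That gap is what a p-value conditioning on the clustering is meant to close.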

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10805457/pdf/
Citations: 0
Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering
IF 4.3; Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-04-01
Noirrit Kiran Chandra, Antonio Canale, David B Dunson

Bayesian mixture models are widely used for clustering of high-dimensional data with appropriate uncertainty quantification. However, as the dimension of the observations increases, posterior inference often tends to favor too many or too few clusters. This article explains this behavior by studying the random partition posterior in a non-standard setting with a fixed sample size and increasing data dimensionality. We provide conditions under which the finite sample posterior tends to either assign every observation to a different cluster or all observations to the same cluster as the dimension grows. Interestingly, the conditions do not depend on the choice of clustering prior, as long as all possible partitions of observations into clusters have positive prior probabilities, and hold irrespective of the true data-generating model. We then propose a class of latent mixtures for Bayesian clustering (Lamb) on a set of low-dimensional latent variables inducing a partition on the observed data. The model is amenable to scalable posterior inference and we show that it can avoid the pitfalls of high-dimensionality under mild assumptions. The proposed approach is shown to have good performance in simulation studies and an application to inferring cell types based on scRNAseq.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11999651/pdf/
Citations: 0
RNN-Attention Based Deep Learning for Solving Inverse Boundary Problems in Nonlinear Marshak Waves
IF 6; Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-04-01; DOI: 10.4208/jml.221209
Di Zhao, Weiming Li, Wengu Chen, Peng Song, and Han Wang
Radiative transfer, described by the radiative transfer equation (RTE), is one of the dominant energy exchange processes in inertial confinement fusion (ICF) experiments. The Marshak wave problem is an important benchmark for the time-dependent RTE. In this work, we present a neural network architecture termed RNN-attention deep learning (RADL) as a surrogate model to solve the inverse boundary problem of the nonlinear Marshak wave in a data-driven fashion. We train the surrogate model on numerical simulation data of the forward problem, and then solve the inverse problem by minimizing, with respect to the boundary condition, the distance between the target solution and the surrogate's predicted solution. This minimization is efficient because the surrogate model bypasses the expensive numerical solution, and because the model is differentiable, so gradient-based optimization algorithms can be used. The effectiveness of our approach is demonstrated by solving the inverse boundary problems of the Marshak wave benchmark in two case studies: one where the transport process is modeled by the RTE and one where it is modeled by its nonlinear diffusion approximation (DA). Finally, the importance of using both the RNN and the factor-attention blocks in the RADL model is illustrated, and the data efficiency of our model is investigated.
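
The surrogate-based inversion described in this abstract can be sketched on a toy problem. The code below is not RADL and does not solve the Marshak wave problem. It assumes a hypothetical closed-form "surrogate" `toy_surrogate(b)` that maps a scalar boundary parameter `b` to a profile on a small grid, and recovers `b` from a target profile by plain gradient descent on the squared distance. The grid, the functional form, the step size, and all names are illustrative assumptions.

```python
import math

GRID = [0.1 * i for i in range(1, 11)]  # hypothetical spatial grid


def toy_surrogate(b):
    """Hypothetical differentiable stand-in for a trained surrogate model:
    boundary parameter b -> predicted profile on GRID."""
    return [math.exp(-x / b) for x in GRID]


def toy_surrogate_grad(b):
    """Derivative of each predicted profile value with respect to b."""
    return [math.exp(-x / b) * x / (b * b) for x in GRID]


def solve_inverse(target, b0=0.5, lr=0.1, steps=2000):
    """Minimize sum((surrogate(b) - target)^2) over b by gradient descent."""
    b = b0
    for _ in range(steps):
        pred = toy_surrogate(b)
        dpred = toy_surrogate_grad(b)
        # Chain rule: d/db of the squared distance to the target profile.
        grad = sum(2.0 * (p - t) * dp for p, t, dp in zip(pred, target, dpred))
        b -= lr * grad
    return b


true_b = 1.5
recovered_b = solve_inverse(toy_surrogate(true_b))
print("recovered boundary parameter:", recovered_b)
```

Each descent step here costs only a cheap surrogate evaluation, not a full forward simulation, which is the efficiency argument the abstract makes. In a real setting the gradient would come from automatic differentiation through the trained network, not from a hand-written derivative.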
Citations: 0
Inference for Gaussian Processes with Matérn Covariogram on Compact Riemannian Manifolds
IF 6; Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-03-01
Didong Li, Wenpin Tang, Sudipto Banerjee

Gaussian processes are widely employed as versatile modelling and predictive tools in spatial statistics, functional data analysis, computer modelling and diverse applications of machine learning. They have been widely studied over Euclidean spaces, where they are specified using covariance functions or covariograms for modelling complex dependencies. There is a growing literature on Gaussian processes over Riemannian manifolds in order to develop richer and more flexible inferential frameworks for non-Euclidean data. While numerical approximations through graph representations have been well studied for the Matérn covariogram and heat kernel, the behaviour of asymptotic inference on the parameters of the covariogram has received relatively scant attention. We focus on asymptotic behaviour for Gaussian processes constructed over compact Riemannian manifolds. Building upon a recently introduced Matérn covariogram on a compact Riemannian manifold, we employ formal notions and conditions for the equivalence of two Matérn Gaussian random measures on compact manifolds to derive the parameter that is identifiable, also known as the microergodic parameter, and formally establish the consistency of the maximum likelihood estimate and the asymptotic optimality of the best linear unbiased predictor. The circle is studied as a specific example of compact Riemannian manifolds with numerical experiments to illustrate and corroborate the theory.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10361735/pdf/
Citations: 0
Dimension-Grouped Mixed Membership Models for Multivariate Categorical Data
IF 4.3; Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-02-01
Yuqi Gu, Elena A Erosheva, Gongjun Xu, David B Dunson

Mixed Membership Models (MMMs) are a popular family of latent structure models for complex multivariate data. Instead of forcing each subject to belong to a single cluster, MMMs incorporate a vector of subject-specific weights characterizing partial membership across clusters. With this flexibility come challenges in uniquely identifying, estimating, and interpreting the parameters. In this article, we propose a new class of Dimension-Grouped MMMs (Gro-M³s) for multivariate categorical data, which improve parsimony and interpretability. In Gro-M³s, observed variables are partitioned into groups such that the latent membership is constant for variables within a group but can differ across groups. Traditional latent class models are obtained when all variables are in one group, while traditional MMMs are obtained when each variable is in its own group. The new model corresponds to a novel decomposition of probability tensors. Theoretically, we derive transparent identifiability conditions for both the unknown grouping structure and model parameters in general settings. Methodologically, we propose a Bayesian approach for Dirichlet Gro-M³s to infer the variable grouping structure and estimate model parameters. Simulation results demonstrate good computational performance and empirically confirm the identifiability results. We illustrate the new methodology through applications to a functional disability survey dataset and a personality test dataset.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12000818/pdf/
Citations: 0
Bayesian Data Selection
IF 6; Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-01-01
Eli N Weinstein, Jeffrey W Miller

Insights into complex, high-dimensional data can be obtained by discovering features of the data that match or do not match a model of interest. To formalize this task, we introduce the "data selection" problem: finding a lower-dimensional statistic, such as a subset of variables, that is well fit by a given parametric model of interest. A fully Bayesian approach to data selection would be to parametrically model the value of the statistic, nonparametrically model the remaining "background" components of the data, and perform standard Bayesian model selection for the choice of statistic. However, fitting a nonparametric model to high-dimensional data tends to be highly inefficient, statistically and computationally. We propose a novel score for performing data selection, the "Stein volume criterion (SVC)", that does not require fitting a nonparametric model. The SVC takes the form of a generalized marginal likelihood with a kernelized Stein discrepancy in place of the Kullback-Leibler divergence. We prove that the SVC is consistent for data selection, and establish consistency and asymptotic normality of the corresponding generalized posterior on parameters. We apply the SVC to the analysis of single-cell RNA sequencing data sets using probabilistic principal components analysis and a spin glass model of gene regulation.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10194814/pdf/
Citations: 0
Model-Based Causal Discovery for Zero-Inflated Count Data
IF 5.2; Zone 3, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS; Pub Date: 2023-01-01
Junsouk Choi, Yang Ni

Zero-inflated count data arise in a wide range of scientific areas such as social science, biology, and genomics. Very few causal discovery approaches can adequately account for excessive zeros as well as various features of multivariate count data such as overdispersion. In this paper, we propose a new zero-inflated generalized hypergeometric directed acyclic graph (ZiG-DAG) model for inference of causal structure from purely observational zero-inflated count data. The proposed ZiG-DAGs exploit a broad family of generalized hypergeometric probability distributions and are useful for modeling various types of zero-inflated count data with great flexibility. In addition, ZiG-DAGs allow for both linear and nonlinear causal relationships. We prove that the causal structure is identifiable for the proposed ZiG-DAGs via a general proof technique for count data, which is applicable beyond the proposed model for investigating causal identifiability. Score-based algorithms are developed for causal structure learning. Extensive synthetic experiments as well as a real dataset with known ground truth demonstrate the superior performance of the proposed method against state-of-the-art alternative methods in discovering causal structure from observational zero-inflated count data. An application of reverse-engineering a gene regulatory network from a single-cell RNA-sequencing dataset illustrates the utility of ZiG-DAGs in practice.
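
To make "excessive zeros" and "overdispersion" concrete, the sketch below simulates a zero-inflated Poisson variable and compares it with what a plain Poisson of the same mean would predict. The zero-inflated Poisson is used here only as a familiar illustration; it is not the generalized hypergeometric family or the ZiG-DAG model from the paper, and the parameter values and function names are arbitrary assumptions.

```python
import math
import random
from statistics import mean, pvariance


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sample_zip(rng, pi_zero, lam):
    """Zero-inflated Poisson: a structural zero with probability pi_zero,
    otherwise an ordinary Poisson(lam) count."""
    if rng.random() < pi_zero:
        return 0
    return sample_poisson(rng, lam)


rng = random.Random(0)
counts = [sample_zip(rng, pi_zero=0.4, lam=3.0) for _ in range(5000)]

m = mean(counts)
zero_fraction = counts.count(0) / len(counts)
poisson_zero_fraction = math.exp(-m)  # zeros a plain Poisson with the same mean would give
dispersion = pvariance(counts) / m    # equals 1 for a plain Poisson

print("observed zero fraction:        ", zero_fraction)
print("Poisson-implied zero fraction: ", poisson_zero_fraction)
print("variance / mean:               ", dispersion)
```

The observed zero fraction exceeds the Poisson-implied one and the variance-to-mean ratio exceeds 1. These are the two features that, according to the abstract, few causal discovery approaches account for.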

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12337821/pdf/
Citations: 0