Journal of Machine Learning Research: Latest Articles

A Brief Survey on the Approximation Theory for Sequence Modelling
Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-06-01, DOI: 10.4208/jml.221221
Haotian Jiang, Qianxiao Li, Zhong Li, Shida Wang
Citations: 0
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-06-01, DOI: 10.4208/jml.230105
Jihao Long and Jiequn Han
Citations: 0
Why Self-Attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-06-01, DOI: 10.4208/jml.221206
Chao Ma and Lexing Ying
Citations: 0
Selective inference for k-means clustering.
IF 4.3, Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-05-01
Yiqun T Chen, Daniela M Witten

We consider the problem of testing for a difference in means between clusters of observations identified via k-means clustering. In this setting, classical hypothesis tests lead to an inflated Type I error rate. In recent work, Gao et al. (2022) considered a related problem in the context of hierarchical clustering. Unfortunately, their solution is highly-tailored to the context of hierarchical clustering, and thus cannot be applied in the setting of k-means clustering. In this paper, we propose a p-value that conditions on all of the intermediate clustering assignments in the k-means algorithm. We show that the p-value controls the selective Type I error for a test of the difference in means between a pair of clusters obtained using k-means clustering in finite samples, and can be efficiently computed. We apply our proposal on hand-written digits data and on single-cell RNA-sequencing data.
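The inflated Type I error described in the abstract can be seen in a small simulation. The sketch below is not the authors' method: it only illustrates the problem, using one-dimensional data with no true clusters, a hand-written 2-means routine, and a naive two-sample z-test with known unit variance. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def two_means_1d(xs, iters=50):
    """Lloyd's algorithm for k=2 in one dimension, initialised at min and max."""
    c0, c1 = min(xs), max(xs)
    for _ in range(iters):
        # Assign each point to the nearer of the two centres.
        left = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        right = [x for x in xs if abs(x - c0) > abs(x - c1)]
        new0, new1 = sum(left) / len(left), sum(right) / len(right)
        if (new0, new1) == (c0, c1):
            break  # assignments are stable
        c0, c1 = new0, new1
    return left, right


def naive_p_value(left, right):
    """Two-sided z-test for equal means, ignoring that clusters were data-driven."""
    diff = sum(left) / len(left) - sum(right) / len(right)
    z = diff / math.sqrt(1.0 / len(left) + 1.0 / len(right))
    return math.erfc(abs(z) / math.sqrt(2.0))


def naive_rejection_rate(n=30, reps=500, alpha=0.05, seed=0):
    """Fraction of null data sets on which the naive test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Null data: every observation comes from the same N(0, 1).
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        left, right = two_means_1d(xs)
        rejections += naive_p_value(left, right) < alpha
    return rejections / reps


print(naive_rejection_rate())
```

With no real difference in means, a valid test at level 0.05 would reject about 5% of the time. The naive test applied to k-means clusters rejects far more often, which is the motivation for a p-value that conditions on the clustering.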

Citations: 0
Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering.
IF 4.3, Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-04-01
Noirrit Kiran Chandra, Antonio Canale, David B Dunson

Bayesian mixture models are widely used for clustering of high-dimensional data with appropriate uncertainty quantification. However, as the dimension of the observations increases, posterior inference often tends to favor too many or too few clusters. This article explains this behavior by studying the random partition posterior in a non-standard setting with a fixed sample size and increasing data dimensionality. We provide conditions under which the finite sample posterior tends to either assign every observation to a different cluster or all observations to the same cluster as the dimension grows. Interestingly, the conditions do not depend on the choice of clustering prior, as long as all possible partitions of observations into clusters have positive prior probabilities, and hold irrespective of the true data-generating model. We then propose a class of latent mixtures for Bayesian clustering (Lamb) on a set of low-dimensional latent variables inducing a partition on the observed data. The model is amenable to scalable posterior inference and we show that it can avoid the pitfalls of high-dimensionality under mild assumptions. The proposed approach is shown to have good performance in simulation studies and an application to inferring cell types based on scRNAseq.

Citations: 0
RNN-Attention Based Deep Learning for Solving Inverse Boundary Problems in Nonlinear Marshak Waves
IF 6, Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-04-01, DOI: 10.4208/jml.221209
Di Zhao, Weiming Li, Wengu Chen, Peng Song, and Han Wang
Radiative transfer, described by the radiative transfer equation (RTE), is one of the dominant energy exchange processes in the inertial confinement fusion (ICF) experiments. The Marshak wave problem is an important benchmark for time-dependent RTE. In this work, we present a neural network architecture termed RNN-attention deep learning (RADL) as a surrogate model to solve the inverse boundary problem of the nonlinear Marshak wave in a data-driven fashion. We train the surrogate model by numerical simulation data of the forward problem, and then solve the inverse problem by minimizing the distance between the target solution and the surrogate predicted solution concerning the boundary condition. This minimization is made efficient because the surrogate model by-passes the expensive numerical solution, and the model is differentiable so the gradient-based optimization algorithms are adopted. The effectiveness of our approach is demonstrated by solving the inverse boundary problems of the Marshak wave benchmark in two case studies: where the transport process is modeled by RTE and where it is modeled by its nonlinear diffusion approximation (DA). Last but not least, the importance of using both the RNN and the factor-attention blocks in the RADL model is illustrated, and the data efficiency of our model is investigated in this work.
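The two-stage idea in the abstract (fit a differentiable surrogate of the forward problem, then recover the boundary condition by gradient-based minimization of the distance to a target solution) can be sketched on a toy problem. Everything below is a hypothetical stand-in: the closed-form `surrogate` replaces a trained network and is unrelated to RADL or to Marshak wave physics, and the scalar boundary value, observation times, learning rate and step count are assumptions chosen for illustration.

```python
import math

# Hypothetical observation times; not taken from the paper.
TIMES = [0.1 * i for i in range(1, 11)]


def surrogate(b):
    """Toy stand-in for a trained surrogate: boundary value b -> predicted profile."""
    return [math.tanh(b) * math.sqrt(t) for t in TIMES]


def surrogate_grad(b):
    """Derivative of each surrogate output with respect to b."""
    return [(1.0 - math.tanh(b) ** 2) * math.sqrt(t) for t in TIMES]


def solve_inverse(target, b0=0.0, lr=0.05, steps=200):
    """Gradient descent on the squared distance between surrogate(b) and target."""
    b = b0
    for _ in range(steps):
        pred = surrogate(b)
        grads = surrogate_grad(b)
        # Chain rule: d/db sum_i (pred_i - target_i)^2.
        dloss_db = sum(2.0 * (p - y) * g for p, y, g in zip(pred, target, grads))
        b -= lr * dloss_db
    return b


true_b = 0.8                  # boundary value used to generate the target
target = surrogate(true_b)    # stands in for the target solution
b_hat = solve_inverse(target)
print(b_hat)
```

Each iteration only evaluates the cheap surrogate and its derivative, never a numerical solver, which is the efficiency argument the abstract makes. In practice the derivative would come from automatic differentiation of the trained model rather than a hand-written formula.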
Citations: 0
Inference for Gaussian Processes with Matérn Covariogram on Compact Riemannian Manifolds.
IF 6, Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-03-01
Didong Li, Wenpin Tang, Sudipto Banerjee

Gaussian processes are widely employed as versatile modelling and predictive tools in spatial statistics, functional data analysis, computer modelling and diverse applications of machine learning. They have been widely studied over Euclidean spaces, where they are specified using covariance functions or covariograms for modelling complex dependencies. There is a growing literature on Gaussian processes over Riemannian manifolds in order to develop richer and more flexible inferential frameworks for non-Euclidean data. While numerical approximations through graph representations have been well studied for the Matérn covariogram and heat kernel, the behaviour of asymptotic inference on the parameters of the covariogram has received relatively scant attention. We focus on asymptotic behaviour for Gaussian processes constructed over compact Riemannian manifolds. Building upon a recently introduced Matérn covariogram on a compact Riemannian manifold, we employ formal notions and conditions for the equivalence of two Matérn Gaussian random measures on compact manifolds to derive the parameter that is identifiable, also known as the microergodic parameter, and formally establish the consistency of the maximum likelihood estimate and the asymptotic optimality of the best linear unbiased predictor. The circle is studied as a specific example of compact Riemannian manifolds with numerical experiments to illustrate and corroborate the theory.

Citations: 0
Dimension-Grouped Mixed Membership Models for Multivariate Categorical Data.
IF 4.3, Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-02-01
Yuqi Gu, Elena A Erosheva, Gongjun Xu, David B Dunson

Mixed Membership Models (MMMs) are a popular family of latent structure models for complex multivariate data. Instead of forcing each subject to belong to a single cluster, MMMs incorporate a vector of subject-specific weights characterizing partial membership across clusters. With this flexibility come challenges in uniquely identifying, estimating, and interpreting the parameters. In this article, we propose a new class of Dimension-Grouped MMMs (Gro-M³s) for multivariate categorical data, which improve parsimony and interpretability. In Gro-M³s, observed variables are partitioned into groups such that the latent membership is constant for variables within a group but can differ across groups. Traditional latent class models are obtained when all variables are in one group, while traditional MMMs are obtained when each variable is in its own group. The new model corresponds to a novel decomposition of probability tensors. Theoretically, we derive transparent identifiability conditions for both the unknown grouping structure and model parameters in general settings. Methodologically, we propose a Bayesian approach for Dirichlet Gro-M³s to inferring the variable grouping structure and estimating model parameters. Simulation results demonstrate good computational performance and empirically confirm the identifiability results. We illustrate the new methodology through applications to a functional disability survey dataset and a personality test dataset.

Citations: 0
Bayesian Data Selection.
IF 6, Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-01-01
Eli N Weinstein, Jeffrey W Miller

Insights into complex, high-dimensional data can be obtained by discovering features of the data that match or do not match a model of interest. To formalize this task, we introduce the "data selection" problem: finding a lower-dimensional statistic, such as a subset of variables, that is well fit by a given parametric model of interest. A fully Bayesian approach to data selection would be to parametrically model the value of the statistic, nonparametrically model the remaining "background" components of the data, and perform standard Bayesian model selection for the choice of statistic. However, fitting a nonparametric model to high-dimensional data tends to be highly inefficient, statistically and computationally. We propose a novel score for performing data selection, the "Stein volume criterion (SVC)", that does not require fitting a nonparametric model. The SVC takes the form of a generalized marginal likelihood with a kernelized Stein discrepancy in place of the Kullback-Leibler divergence. We prove that the SVC is consistent for data selection, and establish consistency and asymptotic normality of the corresponding generalized posterior on parameters. We apply the SVC to the analysis of single-cell RNA sequencing data sets using probabilistic principal components analysis and a spin glass model of gene regulation.

Citations: 0
Semi-Supervised Off-Policy Reinforcement Learning and Value Estimation for Dynamic Treatment Regimes.
IF 5.2, Tier 3, Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2023-01-01
Aaron Sonabend-W, Nilanjana Laha, Ashwin N Ananthakrishnan, Tianxi Cai, Rajarshi Mukherjee

Reinforcement learning (RL) has shown great promise in estimating dynamic treatment regimes which take into account patient heterogeneity. However, health-outcome information, used as the reward for RL methods, is often not well coded but rather embedded in clinical notes. Extracting precise outcome information is a resource-intensive task, so most of the available well-annotated cohorts are small. To address this issue, we propose a semi-supervised learning (SSL) approach that efficiently leverages a small-sized labeled data set with actual outcomes observed and a large unlabeled data set with outcome surrogates. In particular, we propose a semi-supervised, efficient approach to Q-learning and doubly robust off-policy value estimation. Generalizing SSL to dynamic treatment regimes brings interesting challenges: 1) Feature distribution for Q-learning is unknown as it includes previous outcomes. 2) The surrogate variables we leverage in the modified SSL framework are predictive of the outcome but not informative of the optimal policy or value function. We provide theoretical results for our Q function and value function estimators to understand the degree of efficiency gained from SSL. Our method is at least as efficient as the supervised approach, and robust to bias from mis-specification of the imputation models.

Citations: 0