
Latest publications from the Journal of Machine Learning Research

Spatial meshing for general Bayesian multivariate models.
IF 4.3 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-03-01
Michele Peruzzi, David B Dunson

Quantifying spatial and/or temporal associations in multivariate geolocated data of different types is achievable via spatial random effects in a Bayesian hierarchical model, but severe computational bottlenecks arise when spatial dependence is encoded as a latent Gaussian process (GP) in the increasingly common large-scale data settings on which we focus. The scenario worsens in non-Gaussian models because the reduced analytical tractability leads to additional hurdles to computational efficiency. In this article, we introduce Bayesian models of spatially referenced data in which the likelihood or the latent process (or both) are not Gaussian. First, we exploit the advantages of spatial processes built via directed acyclic graphs, in which case the spatial nodes enter the Bayesian hierarchy and lead to posterior sampling via routine Markov chain Monte Carlo (MCMC) methods. Second, motivated by the possible inefficiencies of popular gradient-based sampling approaches in the multivariate contexts on which we focus, we introduce the simplified manifold preconditioner adaptation (SiMPA) algorithm, which uses second-order information about the target but avoids expensive matrix operations. We demonstrate the performance and efficiency improvements of our methods relative to alternatives in extensive synthetic and real-world remote sensing and community ecology applications with large-scale data at up to hundreds of thousands of spatial locations and up to tens of outcomes. Software for the proposed methods is part of the R package meshed, available on CRAN.
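SiMPA itself is specified in the paper; purely as a hedged sketch of the general idea it builds on (second-order information entering a gradient-based sampler only through a cheap preconditioner, so dense matrix operations are avoided), here is a toy preconditioned Langevin step in Python. The function name, target, and diagonal preconditioner are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

# Hypothetical illustration (not the paper's SiMPA): a preconditioned
# unadjusted Langevin sampler where second-order information enters only
# through a fixed diagonal preconditioner, so no dense factorizations
# are ever formed.
def precond_langevin(grad_logpost, x0, precond_diag, eps=0.1, n_iter=5000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_iter, x.size))
    sqrt_m = np.sqrt(precond_diag)
    for t in range(n_iter):
        # Langevin proposal scaled coordinate-wise by the preconditioner
        x = x + 0.5 * eps * precond_diag * grad_logpost(x) \
              + np.sqrt(eps) * sqrt_m * rng.standard_normal(x.size)
        samples[t] = x
    return samples

# Toy target: N(mu, diag(sigma2)); the natural diagonal preconditioner is
# the inverse Hessian of -log p, i.e. sigma2 itself.
mu = np.array([1.0, -2.0])
sigma2 = np.array([0.5, 4.0])
grad = lambda x: -(x - mu) / sigma2
draws = precond_langevin(grad, np.zeros(2), precond_diag=sigma2)
print(draws[1000:].mean(axis=0))  # close to mu
```

With a diagonal preconditioner the per-iteration cost stays linear in dimension, which is the kind of saving the abstract alludes to; SiMPA additionally adapts its preconditioner and targets the exact posterior, neither of which this sketch attempts.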

Citations: 0
Effect-Invariant Mechanisms for Policy Generalization.
IF 4.3 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja, Susan Murphy, Jonas Peters

Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recently, it has been suggested to exploit invariant conditional distributions to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call full invariance) may be too strong an assumption in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance (e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when we have a small sample from the test environment, enabling few-shot policy generalization. Our work does not assume an underlying causal graph or that the data are generated by a structural causal model; instead, we develop testing procedures to test e-invariance directly from data. We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.
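As a hedged toy illustration of what e-invariance means (this is not the paper's testing procedure, and all names and the data-generating model below are invented), the snippet builds two environments whose full conditionals Y | X, A differ while the effect of the action A on Y is shared:

```python
import numpy as np

# Invented two-environment example: the conditional of Y given (X, A)
# differs across environments through the term shift * x, but the
# *effect* of the randomized action A on Y (delta) is shared -- the
# kind of e-invariance the paper exploits.
rng = np.random.default_rng(3)

def sample_env(shift, n=4000, delta=1.5):
    x = rng.normal(size=n)
    a = rng.integers(0, 2, size=n)                 # randomized binary action
    y = shift * x + delta * a + rng.normal(size=n)
    return a, y

def effect_estimate(a, y):
    # Difference in means is a valid effect estimate under randomization.
    return y[a == 1].mean() - y[a == 0].mean()

d0 = effect_estimate(*sample_env(shift=0.0))
d1 = effect_estimate(*sample_env(shift=2.0))
# Both estimates recover the same delta even though Y | X, A differs by
# environment, so full invariance fails while effect-invariance holds.
```

A testing procedure in the spirit of the paper would compare such environment-wise effect estimates against their sampling variability; the paper develops this rigorously and without assuming a causal graph.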

Citations: 0
On Efficient and Scalable Computation of the Nonparametric Maximum Likelihood Estimator in Mixture Models.
IF 5.2 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Yangjing Zhang, Ying Cui, Bodhisattva Sen, Kim-Chuan Toh

In this paper, we focus on the computation of the nonparametric maximum likelihood estimator (NPMLE) in multivariate mixture models. Our approach discretizes this infinite-dimensional convex optimization problem by setting fixed support points for the NPMLE and optimizing over the mixing proportions. We propose an efficient and scalable semismooth Newton based augmented Lagrangian method (ALM). Our algorithm outperforms the state-of-the-art methods (Kim et al., 2020; Koenker and Gu, 2017), capable of handling n ≈ 10⁶ data points with m ≈ 10⁴ support points. A key advantage of our approach is its strategic utilization of the solution's sparsity, leading to structured sparsity in Hessian computations. As a result, our algorithm demonstrates better scaling in terms of m when compared to the mixsqp method (Kim et al., 2020). The computed NPMLE can be directly applied to denoising the observations in the framework of empirical Bayes. We propose new denoising estimands in this context along with their consistent estimates. Extensive numerical experiments are conducted to illustrate the efficiency of our ALM. In particular, we employ our method to analyze two astronomy data sets: (i) the Gaia-TGAS Catalog (Anderson et al., 2018) containing approximately 1.4 × 10⁶ data points in two dimensions, and (ii) a data set from the APOGEE survey (Majewski et al., 2017) with approximately 2.7 × 10⁴ data points.
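The fixed-support formulation can be made concrete with a small sketch. The paper's contribution is the semismooth Newton ALM solver; the hedged Python below instead uses plain EM on the mixing proportions as a simple (much slower) baseline for the same convex problem, on an invented Gaussian-location toy example:

```python
import numpy as np

# Fixed-support NPMLE toy: x_i ~ N(theta_i, 1) with theta_i drawn from an
# unknown mixing distribution; we fix a grid of support points and
# optimize only the mixing proportions w (EM here, ALM in the paper).
rng = np.random.default_rng(1)
theta_true = rng.choice([-2.0, 0.0, 2.0], size=2000)
x = theta_true + rng.standard_normal(2000)

grid = np.linspace(-4, 4, 81)                        # fixed support points
L = np.exp(-0.5 * (x[:, None] - grid[None, :])**2)   # likelihood matrix
w = np.full(grid.size, 1.0 / grid.size)              # mixing proportions

for _ in range(200):                                 # EM updates on w only
    post = L * w
    post /= post.sum(axis=1, keepdims=True)
    w = post.mean(axis=0)

# Empirical-Bayes denoising: posterior mean of theta_i given x_i,
# mirroring the denoising application described in the abstract.
post = L * w
post /= post.sum(axis=1, keepdims=True)
theta_hat = (post * grid).sum(axis=1)
```

In this toy the estimated w concentrates near the three true atoms, and the posterior-mean denoiser beats the raw observations in mean squared error; the point of the paper is to solve the same weight-optimization problem at n ≈ 10⁶, m ≈ 10⁴ scale, where EM becomes impractical.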

Citations: 0
Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST).
IF 5.2 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Jimmy Hickey, Jonathan P Williams, Emily C Hector

Transfer learning uses a data model, trained to make predictions or inferences on data from one population, to make reliable predictions or inferences on data from another population. Most existing transfer learning approaches are based on fine-tuning pre-trained neural network models, and fail to provide crucial uncertainty quantification. We develop a statistical framework for model predictions based on transfer learning, called RECaST. The primary mechanism is a Cauchy random effect that recalibrates a source model to a target population; we mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models, in the sense that prediction sets will achieve their nominal stated coverage, and we numerically illustrate the method's robustness to asymptotic approximations for nonlinear models. Whereas many existing techniques are built on particular source models, RECaST is agnostic to the choice of source model, and does not require access to source data. For example, our RECaST transfer learning approach can be applied to a continuous or discrete data model with linear or logistic regression, deep neural network architectures, etc. Furthermore, RECaST provides uncertainty quantification for predictions, which is mostly absent in the literature. We examine our method's performance in a simulation study and in an application to real hospital data.
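As a drastically simplified, hedged caricature of the calibration idea (a single multiplicative effect mapping source-model predictions to the target population), the sketch below fits that factor by least squares on a small target sample. RECaST itself treats this quantity as a Cauchy random effect and delivers Bayesian prediction sets with nominal coverage, none of which is reproduced here; all data and names are invented:

```python
import numpy as np

# Hypothetical source-to-target calibration sketch (not RECaST itself):
# a scalar beta recalibrates source-model predictions to the target.
rng = np.random.default_rng(0)

# Source population: y = 2*x + noise; fit the source model on ample data.
xs = rng.normal(size=1000)
ys = 2.0 * xs + 0.1 * rng.normal(size=1000)
slope = (xs @ ys) / (xs @ xs)                    # least-squares source model

# Target population: shifted relationship y = 3*x + noise, few samples.
xt = rng.normal(size=30)
yt = 3.0 * xt + 0.1 * rng.normal(size=30)

pred_src = slope * xt                            # uncalibrated source predictions
beta = (pred_src @ yt) / (pred_src @ pred_src)   # scalar calibration factor
pred_cal = beta * pred_src                       # calibrated predictions
```

Note that only the fitted source model and a small target sample are used, echoing the abstract's point that RECaST needs no access to the source data and is agnostic to the source model's form.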

Citations: 0
Nonparametric Regression for 3D Point Cloud Learning.
IF 4.3 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Xinyi Li, Shan Yu, Yueying Wang, Guannan Wang, Li Wang, Ming-Jun Lai

In recent years, the number of point clouds with irregular shapes collected in various areas has increased exponentially. Motivated by the importance of solid modeling for point clouds, we develop a novel and efficient smoothing tool based on multivariate splines over a triangulation to extract the underlying signal and build up a 3D solid model from the point cloud. The proposed method can denoise or deblur the point cloud effectively, provide a multi-resolution reconstruction of the actual signal, and handle sparse and irregularly distributed point clouds to recover the underlying trajectory. In addition, our method provides a natural way of numerosity data reduction. We establish the theoretical guarantees of the proposed method, including the convergence rate and asymptotic normality of the estimator, and show that the convergence rate achieves optimal nonparametric convergence. We also introduce a bootstrap method to quantify the uncertainty of the estimators. Through extensive simulation studies and a real data example, we demonstrate the superiority of the proposed method over traditional smoothing methods in terms of estimation accuracy and efficiency of data reduction.

Citations: 0
Causal Discovery with Generalized Linear Models through Peeling Algorithms.
IF 4.3 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Minjie Wang, Xiaotong Shen, Wei Pan

This article presents a novel method for causal discovery with generalized structural equation models suited for analyzing diverse types of outcomes, including discrete, continuous, and mixed data. Causal discovery often faces challenges due to unmeasured confounders that hinder the identification of causal relationships. The proposed approach addresses this issue by developing two peeling algorithms (bottom-up and top-down) to ascertain causal relationships and valid instruments. This approach first reconstructs a super-graph to represent ancestral relationships between variables, using a peeling algorithm based on nodewise GLM regressions that exploit relationships between primary and instrumental variables. Then, it estimates parent-child effects from the ancestral relationships using another peeling algorithm while deconfounding a child's model with information borrowed from its parents' models. The article offers a theoretical analysis of the proposed approach, establishing conditions for model identifiability and providing statistical guarantees for accurately discovering parent-child relationships via the peeling algorithms. Furthermore, the article presents numerical experiments showcasing the effectiveness of our approach in comparison to state-of-the-art structure learning methods without confounders. Lastly, it demonstrates an application to Alzheimer's disease (AD), highlighting the method's utility in constructing gene-to-gene and gene-to-disease regulatory networks involving Single Nucleotide Polymorphisms (SNPs) for healthy and AD subjects.

Citations: 0
Graphical Dirichlet Process for Clustering Non-Exchangeable Grouped Data.
IF 4.3 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Arhit Chakrabarti, Yang Ni, Ellen Ruth A Morris, Michael L Salinas, Robert S Chapkin, Bani K Mallick

We consider the problem of clustering grouped data with possibly non-exchangeable groups whose dependencies can be characterized by a known directed acyclic graph. To allow the sharing of clusters among the non-exchangeable groups, we propose a Bayesian nonparametric approach, termed graphical Dirichlet process, that jointly models the dependent group-specific random measures by assuming each random measure to be distributed as a Dirichlet process whose concentration parameter and base probability measure depend on those of its parent groups. The resulting joint stochastic process respects the Markov property of the directed acyclic graph that links the groups. We characterize the graphical Dirichlet process using a novel hypergraph representation as well as the stick-breaking representation, the restaurant-type representation, and the representation as a limit of a finite mixture model. We develop an efficient posterior inference algorithm and illustrate our model with simulations and a real grouped single-cell data set.
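The stick-breaking representation mentioned above is easy to sketch for an ordinary Dirichlet process DP(α, G₀), which is the single-group building block that the graphical Dirichlet process couples across groups; a minimal Python version, truncated to a finite number of atoms:

```python
import numpy as np

# Stick-breaking construction of a (truncated) Dirichlet process
# DP(alpha, G0): weights w_k = v_k * prod_{j<k} (1 - v_j) with
# v_k ~ Beta(1, alpha), and atoms drawn i.i.d. from the base measure G0.
# The graphical DP ties such constructions together along a DAG of
# groups; this sketch shows only the single-group case.
def stick_breaking(alpha, n_atoms, base_sampler, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    w = v * remaining                    # broken-off stick lengths
    atoms = base_sampler(rng, n_atoms)   # locations from the base measure
    return w, atoms

w, atoms = stick_breaking(alpha=2.0, n_atoms=500,
                          base_sampler=lambda rng, k: rng.standard_normal(k))
```

With 500 atoms the truncation error (the unbroken remainder of the stick) is negligible for moderate α, so the weights sum to essentially one.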

Citations: 0
Learning Optimal Dynamic Treatment Regimens Subject to Stagewise Risk Controls.
IF 5.2 · Computer Science (CAS Tier 3) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Mochuan Liu, Yuanjia Wang, Haoda Fu, Donglin Zeng

Dynamic treatment regimens (DTRs) aim at tailoring individualized sequential treatment rules that maximize cumulative beneficial outcomes by accommodating patients' heterogeneity in decision-making. For many chronic diseases including type 2 diabetes mellitus (T2D), treatments are usually multifaceted in the sense that aggressive treatments with a higher expected reward are also likely to elevate the risk of acute adverse events. In this paper, we propose a new weighted learning framework, namely benefit-risk dynamic treatment regimens (BR-DTRs), to address the benefit-risk trade-off. The new framework relies on a backward learning procedure by restricting the induced risk of the treatment rule to be no larger than a pre-specified risk constraint at each treatment stage. Computationally, the estimated treatment rule solves a weighted support vector machine problem with a modified smooth constraint. Theoretically, we show that the proposed DTRs are Fisher consistent, and we further obtain the convergence rates for both the value and risk functions. Finally, the performance of the proposed method is demonstrated via extensive simulation studies and application to a real study for T2D patients.

Citations: 0
Convergence for nonconvex ADMM, with applications to CT imaging.
IF 4.3, CAS Tier 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2024-01-01
Rina Foygel Barber, Emil Y Sidky

The alternating direction method of multipliers (ADMM) algorithm is a powerful and flexible tool for complex optimization problems of the form min{f(x) + g(y) : Ax + By = c}. ADMM exhibits robust empirical performance across a range of challenging settings, including nonsmoothness and nonconvexity of the objective functions f and g, and provides a simple and natural approach to the inverse problem of image reconstruction for computed tomography (CT) imaging. From the theoretical point of view, existing results for convergence in the nonconvex setting generally assume smoothness in at least one of the component functions in the objective. In this work, our new theoretical results provide convergence guarantees under a restricted strong convexity assumption without requiring smoothness or differentiability, while still allowing differentiable terms to be treated approximately if needed. We validate these theoretical results empirically, with a simulated example where both f and g are nondifferentiable (and thus outside the scope of existing theory), as well as a simulated CT image reconstruction problem.
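The template min{f(x) + g(y) : Ax + By = c} can be made concrete with a standard scaled-form ADMM sketch for the lasso special case f(x) = 0.5||Dx - b||², g(y) = λ||y||₁, with A = I, B = -I, c = 0. The problem sizes, design matrix D, and parameters λ and ρ below are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Lasso instance: f(x) = 0.5*||Dx - b||^2, g(y) = lam*||y||_1, constraint x = y.
n, p = 50, 20
D = rng.normal(size=(n, p))
b = rng.normal(size=n)
lam, rho = 0.5, 1.0

x = np.zeros(p)
y = np.zeros(p)
u = np.zeros(p)                              # scaled dual variable
M = np.linalg.inv(D.T @ D + rho * np.eye(p)) # cached factor for the x-update
Dtb = D.T @ b

for _ in range(500):
    x = M @ (Dtb + rho * (y - u))            # x-update: smooth quadratic solve
    z = x + u
    y = np.sign(z) * np.maximum(np.abs(z) - lam / rho, 0.0)  # soft-threshold
    u = u + x - y                            # dual ascent on the constraint x = y

residual = np.linalg.norm(x - y)             # primal residual, ~0 at convergence
```

Here both subproblems are cheap: the x-update is a linear solve (with the matrix factor cached outside the loop) and the y-update is the proximal operator of the ℓ1 norm. The paper's interest is in harder cases where f and g need not be smooth or convex.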

{"title":"Convergence for nonconvex ADMM, with applications to CT imaging.","authors":"Rina Foygel Barber, Emil Y Sidky","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The alternating direction method of multipliers (ADMM) algorithm is a powerful and flexible tool for complex optimization problems of the form <math><mi>m</mi> <mi>i</mi> <mi>n</mi> <mo>{</mo> <mi>f</mi> <mo>(</mo> <mi>x</mi> <mo>)</mo> <mo>+</mo> <mi>g</mi> <mo>(</mo> <mi>y</mi> <mo>)</mo> <mspace></mspace> <mo>:</mo> <mspace></mspace> <mi>A</mi> <mi>x</mi> <mo>+</mo> <mi>B</mi> <mi>y</mi> <mo>=</mo> <mi>c</mi> <mo>}</mo></math> . ADMM exhibits robust empirical performance across a range of challenging settings including nonsmoothness and nonconvexity of the objective functions <math><mi>f</mi></math> and <math><mi>g</mi></math> , and provides a simple and natural approach to the inverse problem of image reconstruction for computed tomography (CT) imaging. From the theoretical point of view, existing results for convergence in the nonconvex setting generally assume smoothness in at least one of the component functions in the objective. In this work, our new theoretical results provide convergence guarantees under a restricted strong convexity assumption without requiring smoothness or differentiability, while still allowing differentiable terms to be treated approximately if needed. 
We validate these theoretical results empirically, with a simulated example where both <math><mi>f</mi></math> and <math><mi>g</mi></math> are nondifferentiable-and thus outside the scope of existing theory-as well as a simulated CT image reconstruction problem.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":4.3,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11155492/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141297149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Framework for Improving the Reliability of Black-box Variational Inference.
IF 5.2, CAS Tier 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2024-01-01
Manushi Welandawe, Michael Riis Andersen, Aki Vehtari, Jonathan H Huggins

Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics as a fast yet flexible alternative to Markov chain Monte Carlo methods for approximate Bayesian inference. However, stochastic optimization methods for BBVI remain unreliable and require substantial expertise and hand-tuning to apply effectively. In this paper, we propose robust and automated black-box VI (RABVI), a framework for improving the reliability of BBVI optimization. RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation. RABVI adaptively decreases the learning rate by detecting convergence of the fixed-learning-rate iterates, then estimates the symmetrized Kullback-Leibler (KL) divergence between the current variational approximation and the optimal one. It also employs a novel optimization termination criterion that enables the user to balance desired accuracy against computational cost by comparing (i) the predicted relative decrease in the symmetrized KL divergence if a smaller learning rate were used and (ii) the predicted computation required to converge with the smaller learning rate. We validate the robustness and accuracy of RABVI through carefully designed simulation studies and on a diverse set of real-world model and data examples.
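The "decrease the learning rate once the fixed-rate iterates have converged" scheme can be sketched on a toy 1-D stochastic optimization problem: minimize E[0.5(θ - z)²] with z ~ N(2, 1) from noisy gradients, halving the rate each time the averaged iterate stalls. The phase length, stall test, and decay factor are illustrative choices for this sketch, not the RABVI algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy problem: noisy gradients of 0.5*(theta - z)^2 with z ~ N(2, 1);
# the optimum is theta = 2. Each phase runs fixed-rate SGD until the
# running average of the iterates stalls, then the rate is halved.
theta, lr = 0.0, 0.5
for phase in range(8):
    avg_prev = np.inf
    for _ in range(50):                   # bounded number of stall checks
        iterates = []
        for _ in range(100):              # 100 fixed-learning-rate SGD steps
            grad = theta - rng.normal(loc=2.0)
            theta -= lr * grad
            iterates.append(theta)
        avg = float(np.mean(iterates))
        if abs(avg - avg_prev) < 0.05:    # averaged iterate has stalled:
            break                         # treat this rate as converged
        avg_prev = avg
    lr *= 0.5                             # halve the rate and continue

estimate = theta                          # should settle near the optimum 2.0
```

RABVI itself replaces the crude stall test with a principled convergence diagnostic and decides when to stop by trading the predicted symmetrized-KL improvement against the predicted extra computation; this sketch only shows the outer fixed-rate-then-decrease structure.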

{"title":"A Framework for Improving the Reliability of Black-box Variational Inference.","authors":"Manushi Welandawe, Michael Riis Andersen, Aki Vehtari, Jonathan H Huggins","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics as a fast yet flexible alternative to Markov chain Monte Carlo methods for approximate Bayesian inference. However, stochastic optimization methods for BBVI remain unreliable and require substantial expertise and hand-tuning to apply effectively. In this paper, we propose <i>robust and automated black-box VI</i> (RABVI), a framework for improving the reliability of BBVI optimization. RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation. RABVI adaptively decreases the learning rate by detecting convergence of the fixed-learning-rate iterates, then estimates the symmetrized Kullback-Leibler (KL) divergence between the current variational approximation and the optimal one. It also employs a novel optimization termination criterion that enables the user to balance desired accuracy against computational cost by comparing (i) the predicted relative decrease in the symmetrized KL divergence if a smaller learning were used and (ii) the predicted computation required to converge with the smaller learning rate. 
We validate the robustness and accuracy of RABVI through carefully designed simulation studies and on a diverse set of real-world model and data examples.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 219","pages":"1-71"},"PeriodicalIF":5.2,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12668294/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145662213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0