Journal of Machine Learning Research: Latest Articles

Dynamic Bayesian Learning for Spatiotemporal Mechanistic Models.

IF 5.2, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2025-01-01

Sudipto Banerjee, Xiang Chen, Ian Frankenburg, Daniel Zhou

We develop an approach for Bayesian learning of spatiotemporal dynamical mechanistic models. Such learning consists of statistical emulation of the mechanistic system that can efficiently interpolate the output of the system from arbitrary inputs. The emulated learner can then be used to train the system from noisy data achieved by melding information from observed data with the emulated mechanistic system. This joint melding of mechanistic systems employs hierarchical state-space models with Gaussian process regression. Assuming the dynamical system is controlled by a finite collection of inputs, Gaussian process regression learns the effect of these parameters through a number of training runs, driving the stochastic innovations of the spatiotemporal state-space component. This enables efficient modeling of the dynamics over space and time. This article details exact inference with analytically accessible posterior distributions in hierarchical matrix-variate Normal and Wishart models in designing the emulator. This step obviates expensive iterative algorithms such as Markov chain Monte Carlo or variational approximations. We also show how emulation is applicable to large-scale emulation by designing a dynamic Bayesian transfer learning framework. Inference on $η$ proceeds using Markov chain Monte Carlo as a post-emulation step using the emulator as a regression component. We demonstrate this framework through solving inverse problems arising in the analysis of ordinary and partial nonlinear differential equations and, in addition, to a black-box computer model generating spatiotemporal dynamics across a graphical model.
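The emulation idea in this abstract (learning a system's output as a function of its inputs from a handful of training runs) can be illustrated with a small sketch. This is not the authors' hierarchical matrix-variate model: it assumes a scalar output, a closed-form logistic growth curve standing in for the mechanistic system, and a plain Gaussian process posterior mean with a squared-exponential kernel. All function names, the kernel, and its hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def mechanistic_output(eta, t=5.0, y0=0.1, capacity=1.0):
    """Closed-form logistic growth at time t; a hypothetical stand-in for an expensive simulator."""
    return capacity / (1.0 + (capacity - y0) / y0 * np.exp(-eta * t))

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def fit_emulator(eta_train, y_train, jitter=1e-8):
    """Precompute the weights of the GP posterior mean from the training runs."""
    offset = y_train.mean()  # constant mean, removed before fitting
    K = rbf_kernel(eta_train, eta_train) + jitter * np.eye(len(eta_train))
    weights = np.linalg.solve(K, y_train - offset)
    return eta_train, weights, offset

def emulate(emulator, eta_new):
    """GP posterior mean at new inputs: offset + K(new, train) @ weights."""
    eta_train, weights, offset = emulator
    eta_new = np.atleast_1d(eta_new).astype(float)
    return offset + rbf_kernel(eta_new, eta_train) @ weights

# Eight training runs of the (assumed) simulator over a range of input values.
eta_train = np.linspace(0.5, 1.5, 8)
emulator = fit_emulator(eta_train, mechanistic_output(eta_train))

# Compare the emulator with the simulator at an input that was not a training run.
print(emulate(emulator, 1.0)[0], mechanistic_output(1.0))
```

Once fitted, the emulator is cheap to evaluate, which is what makes it usable as a regression component inside a later inference step on the inputs.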

Volume : 26

Open Access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12676262/pdf/

Citations: 0

Bayesian Multi-Group Gaussian Process Models for Heterogeneous Group-Structured Data.

IF 5.2, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2025-01-01

Didong Li, Andrew Jones, Sudipto Banerjee, Barbara Engelhardt

Gaussian processes are pervasive in functional data analysis, machine learning, and spatial statistics for modeling complex dependencies. Scientific data are often heterogeneous in their inputs and contain multiple known discrete groups of samples; thus, it is desirable to leverage the similarity among groups while accounting for heterogeneity across groups. We propose multi-group Gaussian processes (MGGPs) defined over $\mathbb{R}^{p} \times \mathcal{C}$, where $\mathcal{C}$ is a finite set representing the group label, by developing general classes of valid (positive definite) covariance functions on such domains. MGGPs are able to accurately recover relationships between the groups and efficiently share strength across samples from all groups during inference, while capturing distinct group-specific behaviors in the conditional posterior distributions. We demonstrate inference in MGGPs through simulation experiments, and we apply our proposed MGGP regression framework to gene expression data to illustrate the behavior and enhanced inferential capabilities of multi-group Gaussian processes by jointly modeling continuous and categorical variables.
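One simple way to obtain a valid covariance function on a product of continuous inputs and group labels is a separable product: a positive semidefinite group-similarity matrix multiplied elementwise with a standard kernel on the continuous inputs. The sketch below shows that construction only. It is an assumption made for illustration, not necessarily one of the covariance classes developed in the paper, and the names `multigroup_kernel` and `group_cov` are hypothetical.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel between rows of X1 (n1 x p) and X2 (n2 x p)."""
    sq_dist = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def multigroup_kernel(X1, g1, X2, g2, group_cov, lengthscale=1.0):
    """Separable kernel k((x, c), (x', c')) = group_cov[c, c'] * rbf(x, x').

    group_cov must be positive semidefinite; the elementwise (Schur) product
    of two positive semidefinite Gram matrices is again positive semidefinite.
    """
    return group_cov[np.ix_(g1, g2)] * rbf_kernel(X1, X2, lengthscale)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))             # six samples in R^2
groups = np.array([0, 0, 1, 1, 2, 2])   # group label of each sample
group_cov = np.array([                  # groups 0 and 1 similar, group 2 distinct
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
])

K = multigroup_kernel(X, groups, X, groups, group_cov)
print(K.shape, np.linalg.eigvalsh(K).min())
```

Larger off-diagonal entries in `group_cov` let samples from one group inform predictions in another, which is the mechanism behind sharing strength across groups.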

Volume : 26

Open Access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12463451/pdf/

Citations: 0

DisC²o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data.

IF 4.3, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2025-01-01

Jiayi Tong, Jie Hu, George Hripcsak, Yang Ning, Yong Chen

High-dimensional healthcare data, such as electronic health records (EHR) data and claims data, present two primary challenges due to the large number of variables and the need to consolidate data from multiple clinical sites. The third key challenge is the potential existence of heterogeneity in terms of covariate shift. In this paper, we propose a distributed learning algorithm accounting for covariate shift to estimate the average treatment effect (ATE) for high-dimensional data, named DisC²o-HD. Leveraging the surrogate likelihood method, our method calibrates the estimates of the propensity score and outcome models to approximately attain the desired covariate balancing property, while accounting for the covariate shift across multiple clinical sites. We show that our distributed covariate balancing propensity score estimator can approximate the pooled estimator, which is obtained by pooling the data from multiple sites together. The proposed estimator remains consistent if either the propensity score model or the outcome regression model is correctly specified. The semiparametric efficiency bound is achieved when both the propensity score and the outcome models are correctly specified. We conduct simulation studies to demonstrate the performance of the proposed algorithm; additionally, we apply the algorithm to a real-world data set to present the readiness of implementation and validity.
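The double robustness property stated in this abstract (consistency when either the propensity score model or the outcome model is correct) can be seen in a small simulation. The sketch below uses the textbook augmented inverse-probability-weighted (AIPW) estimator on a single pooled data set. It is not the distributed DisC²o-HD algorithm, and the data-generating process, function names, and sample size are all assumptions made for illustration.

```python
import numpy as np

def aipw_ate(y, a, propensity, mu1, mu0):
    """Augmented inverse-probability-weighted estimate of E[Y(1) - Y(0)]."""
    psi = (mu1 - mu0
           + a * (y - mu1) / propensity
           - (1 - a) * (y - mu0) / (1 - propensity))
    return psi.mean()

def simulate(n, rng):
    """Hypothetical data: one confounder x, true average treatment effect 2.0."""
    x = rng.normal(size=n)
    e = 1.0 / (1.0 + np.exp(-0.5 * x))      # true propensity score
    a = rng.binomial(1, e)                  # treatment assignment
    y = 2.0 * a + x + rng.normal(size=n)    # outcome
    return x, e, a, y

rng = np.random.default_rng(1)
x, e, a, y = simulate(20_000, rng)

# Correct propensity score, deliberately wrong outcome model (all zeros).
zeros = np.zeros_like(y)
ate_wrong_outcome = aipw_ate(y, a, e, zeros, zeros)

# Correct outcome model, deliberately wrong propensity score (constant 0.5).
ate_wrong_propensity = aipw_ate(y, a, np.full_like(y, 0.5), 2.0 + x, x)

print(ate_wrong_outcome, ate_wrong_propensity)
```

Both estimates should land near the true effect of 2.0, since each run keeps one of the two nuisance models correct.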

Volume : 26

Open Access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12269483/pdf/

Citations: 0

Bayesian Data Sketching for Varying Coefficient Regression Models.

IF 5.2, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2025-01-01

Rajarshi Guhaniyogi, Laura Baracaldo, Sudipto Banerjee

Varying coefficient models are popular for estimating nonlinear regression functions in functional data models. Their Bayesian variants have received limited attention in large data applications, primarily due to prohibitively slow posterior computations using Markov chain Monte Carlo (MCMC) algorithms. We introduce Bayesian data sketching for varying coefficient models to obviate computational challenges presented by large sample sizes. To address the challenges of analyzing large data, we compress the functional response vector and predictor matrix by a random linear transformation to achieve dimension reduction and conduct inference on the compressed data. Our approach distinguishes itself from several existing methods for analyzing large functional data in that it requires neither the development of new models or algorithms, nor any specialized computational hardware while delivering fully model-based Bayesian inference. Well-established methods and algorithms for varying coefficient regression models can be applied to the compressed data. We establish posterior contraction rates for estimating the varying coefficients and predicting the outcome at new locations with the randomly compressed data model. We use simulation experiments and analyze remote sensed vegetation data to empirically illustrate the inferential and computational efficiency of our approach.
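The compression step described in this abstract, a random linear transformation applied to both the response vector and the predictor matrix, can be sketched in a few lines. To keep the example short, an ordinary conjugate Bayesian linear regression stands in for the varying coefficient model. The Gaussian sketching matrix with variance 1/n, the prior and noise variances, and all function names are assumptions, not details taken from the paper.

```python
import numpy as np

def sketch(y, X, m, rng):
    """Compress n observations to m rows with a random Gaussian matrix Phi (m x n)."""
    n = len(y)
    Phi = rng.normal(scale=1.0 / np.sqrt(n), size=(m, n))
    return Phi @ y, Phi @ X

def posterior_mean(y, X, noise_var=0.01, prior_var=10.0):
    """Posterior mean of beta under y ~ N(X beta, noise_var I), beta ~ N(0, prior_var I)."""
    p = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(p) / prior_var
    return np.linalg.solve(precision, X.T @ y / noise_var)

rng = np.random.default_rng(2)
n, m = 2000, 200                         # full and compressed sample sizes
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, 3))
y = X @ beta_true + 0.1 * rng.normal(size=n)

# The same routine is applied to the compressed data; only the input changes.
y_s, X_s = sketch(y, X, m, rng)
beta_sketched = posterior_mean(y_s, X_s)
print(beta_sketched)
```

Because the model-fitting routine is unchanged and only sees m rows instead of n, existing algorithms can be reused on the compressed data, which matches the abstract's claim that no new models or algorithms are required.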

Volume : 26

Open Access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12666391/pdf/

Citations: 0

Efficient and Robust Semi-supervised Estimation of Average Treatment Effect with Partially Annotated Treatment and Response.

IF 5.2, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2025-01-01

Jue Hou, Rajarshi Mukherjee, Tianxi Cai

A notable challenge of leveraging Electronic Health Records (EHR) for treatment effect assessment is the lack of precise information on important clinical variables, including the treatment received and the response. Both treatment information and response cannot be accurately captured by readily available EHR features in many studies and require labor-intensive manual chart review to precisely annotate, which limits the number of available gold standard labels on these key variables. We considered average treatment effect (ATE) estimation when 1) exact treatment and outcome variables are only observed together in a small labeled subset and 2) noisy surrogates of treatment and outcome, such as relevant prescription and diagnosis codes, along with potential confounders are observed for all subjects. We derived the efficient influence function for ATE and used it to construct a semi-supervised multiple machine learning (SMMAL) estimator. We justified that our SMMAL ATE estimator is semi-parametric efficient with B-spline regression under low-dimensional smooth models. We developed the adaptive sparsity/model doubly robust estimation under high-dimensional logistic propensity score and outcome regression models. Results from simulation studies demonstrated the validity of our SMMAL method and its superiority over supervised and unsupervised benchmarks. We applied SMMAL to the assessment of targeted therapies for metastatic colorectal cancer in comparison to chemotherapy.

Volume : 26

Open Access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12671556/pdf/

Citations: 0

Directed Cyclic Graphs for Simultaneous Discovery of Time-Lagged and Instantaneous Causality from Longitudinal Data Using Instrumental Variables.

IF 5.2, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2025-01-01

Wei Jin, Yang Ni, Amanda B Spence, Leah H Rubin, Yanxun Xu

We consider the problem of causal discovery from longitudinal observational data. We develop a novel framework that simultaneously discovers the time-lagged causality and the possibly cyclic instantaneous causality. Under common causal discovery assumptions, combined with additional instrumental information typically available in longitudinal data, we prove the proposed model is generally identifiable. To the best of our knowledge, this is the first causal identification theory for directed graphs with general cyclic patterns that achieves unique causal identifiability. Structural learning is carried out in a fully Bayesian fashion. Through extensive simulations and an application to the Women's Interagency HIV Study, we demonstrate the identifiability, utility, and superiority of the proposed model against state-of-the-art alternative methods.

Citations: 0

Flexible Bayesian Product Mixture Models for Vector Autoregressions.

IF 4.3, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2024-04-01

Suprateek Kundu, Joshua Lukemire

Bayesian non-parametric methods based on Dirichlet process mixtures have seen tremendous success in various domains and are appealing in being able to borrow information by clustering samples that share identical parameters. However, such methods can face hurdles in heterogeneous settings where objects are expected to cluster only along a subset of axes or where clusters of samples share only a subset of identical parameters. We overcome such limitations by developing a novel class of product of Dirichlet process location-scale mixtures that enables independent clustering at multiple scales, which results in varying levels of information sharing across samples. First, we develop the approach for independent multivariate data. Subsequently we generalize it to multivariate time-series data under the framework of multi-subject Vector Autoregressive (VAR) models that is our primary focus, which go beyond parametric single-subject VAR models. We establish posterior consistency and develop efficient posterior computation for implementation. Extensive numerical studies involving VAR models show distinct advantages over competing methods in terms of estimation, clustering, and feature selection accuracy. Our resting state fMRI analysis from the Human Connectome Project reveals biologically interpretable connectivity differences between distinct intelligence groups, while another air pollution application illustrates the superior forecasting accuracy compared to alternate methods.

Volume : 25

Open Access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11646655/pdf/

Citations: 0

Spatial meshing for general Bayesian multivariate models.

IF 4.3, Zone 3 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2024-03-01

Michele Peruzzi, David B Dunson

Quantifying spatial and/or temporal associations in multivariate geolocated data of different types is achievable via spatial random effects in a Bayesian hierarchical model, but severe computational bottlenecks arise when spatial dependence is encoded as a latent Gaussian process (GP) in the increasingly common large-scale data settings on which we focus. The scenario worsens in non-Gaussian models because the reduced analytical tractability leads to additional hurdles to computational efficiency. In this article, we introduce Bayesian models of spatially referenced data in which the likelihood or the latent process (or both) are not Gaussian. First, we exploit the advantages of spatial processes built via directed acyclic graphs, in which case the spatial nodes enter the Bayesian hierarchy and lead to posterior sampling via routine Markov chain Monte Carlo (MCMC) methods. Second, motivated by the possible inefficiencies of popular gradient-based sampling approaches in the multivariate contexts on which we focus, we introduce the simplified manifold preconditioner adaptation (SiMPA) algorithm, which uses second-order information about the target but avoids expensive matrix operations. We demonstrate the performance and efficiency improvements of our methods relative to alternatives in extensive synthetic and real-world remote sensing and community ecology applications with large-scale data at up to hundreds of thousands of spatial locations and up to tens of outcomes. Software for the proposed methods is part of R package meshed, available on CRAN.

Volume : 25

Open Access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12237421/pdf/

Citations: 0

Effect-Invariant Mechanisms for Policy Generalization.

IF 4.3 Zone 3 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2024-01-01

Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja, Susan Murphy, Jonas Peters

Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recent work suggests exploiting invariant conditional distributions to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call full invariance) may be too strong an assumption in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance (e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when we have a small sample from the test environment, enabling few-shot policy generalization. Our work does not assume an underlying causal graph or that the data are generated by a structural causal model; instead, we develop procedures to test e-invariance directly from data. We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.



Citations: 0

Nonparametric Regression for 3D Point Cloud Learning.

IF 4.3 Zone 3 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research

Pub Date : 2024-01-01

Xinyi Li, Shan Yu, Yueying Wang, Guannan Wang, Li Wang, Ming-Jun Lai

In recent years, the number of irregularly shaped point clouds collected in various areas has grown exponentially. Motivated by the importance of solid modeling for point clouds, we develop a novel and efficient smoothing tool based on multivariate splines over a triangulation to extract the underlying signal and build a 3D solid model from the point cloud. The proposed method can denoise or deblur the point cloud effectively, provide a multi-resolution reconstruction of the actual signal, and handle sparse and irregularly distributed point clouds to recover the underlying trajectory. In addition, our method provides a natural approach to numerosity reduction of the data. We establish theoretical guarantees for the proposed method, including the convergence rate and asymptotic normality of the estimator, and show that the estimator achieves the optimal nonparametric convergence rate. We also introduce a bootstrap method to quantify the uncertainty of the estimators. Through extensive simulation studies and a real data example, we demonstrate the superiority of the proposed method over traditional smoothing methods in terms of estimation accuracy and efficiency of data reduction.
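The abstract mentions a bootstrap for quantifying estimator uncertainty but gives no details. As a rough illustration of the general idea only, the sketch below applies a residual bootstrap to a one-dimensional Gaussian kernel smoother. The kernel smoother is a stand-in chosen for brevity, not the paper's multivariate splines over a triangulation, and every name in it (`kernel_smooth`, `residual_bootstrap_se`, the bandwidth and the synthetic data) is an assumption made for this example.

```python
import math
import random
import statistics


def kernel_smooth(xs, ys, grid, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel at each grid point."""
    fitted = []
    for g in grid:
        weights = [math.exp(-0.5 * ((x - g) / bandwidth) ** 2) for x in xs]
        total = sum(weights)
        fitted.append(sum(w * y for w, y in zip(weights, ys)) / total)
    return fitted


def residual_bootstrap_se(xs, ys, grid, bandwidth, n_boot=200, seed=0):
    """Pointwise bootstrap standard errors of the smoother on `grid`."""
    rng = random.Random(seed)
    fit_at_data = kernel_smooth(xs, ys, xs, bandwidth)
    residuals = [y - f for y, f in zip(ys, fit_at_data)]
    mean_res = statistics.fmean(residuals)
    centered = [r - mean_res for r in residuals]  # center before resampling
    replicates = []
    for _ in range(n_boot):
        # Rebuild a synthetic response from the fit plus resampled residuals.
        y_star = [f + rng.choice(centered) for f in fit_at_data]
        replicates.append(kernel_smooth(xs, y_star, grid, bandwidth))
    # Standard deviation across replicates, one value per grid point.
    return [statistics.stdev(col) for col in zip(*replicates)]


# Hypothetical data: a noisy sine curve observed at 100 equally spaced points.
data_rng = random.Random(1)
xs = [i / 99 for i in range(100)]
ys = [math.sin(2 * math.pi * x) + data_rng.gauss(0, 0.2) for x in xs]
grid = [0.1, 0.5, 0.9]

estimate = kernel_smooth(xs, ys, grid, bandwidth=0.05)
std_err = residual_bootstrap_se(xs, ys, grid, bandwidth=0.05)
for g, est, se in zip(grid, estimate, std_err):
    print(f"x={g:.1f}  estimate={est:+.3f}  bootstrap SE={se:.3f}")
```

The resulting standard errors could be turned into approximate pointwise bands (for example, estimate ± 2 SE), although the paper's own procedure and its theoretical justification may differ from this simplified version.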



Citations: 0