
Latest Publications from the Journal of Machine Learning Research

Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST).
IF 5.2 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Jimmy Hickey, Jonathan P Williams, Emily C Hector

Transfer learning uses a data model, trained to make predictions or inferences on data from one population, to make reliable predictions or inferences on data from another population. Most existing transfer learning approaches are based on fine-tuning pre-trained neural network models, and fail to provide crucial uncertainty quantification. We develop a statistical framework for model predictions based on transfer learning, called RECaST. The primary mechanism is a Cauchy random effect that recalibrates a source model to a target population; we mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models, in the sense that prediction sets will achieve their nominal stated coverage, and we numerically illustrate the method's robustness to asymptotic approximations for nonlinear models. Whereas many existing techniques are built on particular source models, RECaST is agnostic to the choice of source model, and does not require access to source data. For example, our RECaST transfer learning approach can be applied to a continuous or discrete data model with linear or logistic regression, deep neural network architectures, etc. Furthermore, RECaST provides uncertainty quantification for predictions, which is mostly absent in the literature. We examine our method's performance in a simulation study and in an application to real hospital data.
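
Below is a minimal, hedged sketch of the core RECaST mechanism as described in the abstract: a source linear model's predictions are recalibrated to a small target sample through a Cauchy-distributed random effect, whose fitted quantiles yield prediction intervals. The ratio-based fit, the toy data, and all names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of the RECaST idea (not the authors' code): a source
# linear model is recalibrated to a target population through a Cauchy
# random effect on its predictions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Source population: y = 2x + 1 with small noise; fit a linear source model.
x_src = rng.uniform(1, 5, 500)
y_src = 2 * x_src + 1 + rng.normal(0, 0.3, 500)
b1, b0 = np.polyfit(x_src, y_src, 1)
f_src = lambda x: b1 * x + b0

# Target population: shifted relationship, only a small sample available.
x_tgt = rng.uniform(1, 5, 40)
y_tgt = 2.6 * x_tgt + 0.5 + rng.normal(0, 0.3, 40)

# Random effect beta_i = y_i / f_src(x_i); model it as Cauchy(loc, scale).
ratios = y_tgt / f_src(x_tgt)
loc, scale = stats.cauchy.fit(ratios)

# 95% prediction interval for a new target point: scale the source
# prediction by the Cauchy quantiles of the fitted random effect.
x_new = 3.0
lo, hi = f_src(x_new) * stats.cauchy.ppf([0.025, 0.975], loc, scale)
print(f"prediction interval at x={x_new}: [{lo:.2f}, {hi:.2f}]")
```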

{"title":"Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST).","authors":"Jimmy Hickey, Jonathan P Williams, Emily C Hector","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Transfer learning uses a data model, trained to make predictions or inferences on data from one population, to make reliable predictions or inferences on data from another population. Most existing transfer learning approaches are based on fine-tuning pre-trained neural network models, and fail to provide crucial uncertainty quantification. We develop a statistical framework for model predictions based on transfer learning, called <i>RECaST</i>. The primary mechanism is a Cauchy random effect that recalibrates a source model to a target population; we mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models, in the sense that prediction sets will achieve their nominal stated coverage, and we numerically illustrate the method's robustness to asymptotic approximations for nonlinear models. Whereas many existing techniques are built on particular source models, RECaST is agnostic to the choice of source model, and does not require access to source data. For example, our RECaST transfer learning approach can be applied to a continuous or discrete data model with linear or logistic regression, deep neural network architectures, etc. Furthermore, RECaST provides uncertainty quantification for predictions, which is mostly absent in the literature. We examine our method's performance in a simulation study and in an application to real hospital data.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12700631/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145758256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Causal Discovery with Generalized Linear Models through Peeling Algorithms.
IF 4.3 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Minjie Wang, Xiaotong Shen, Wei Pan

This article presents a novel method for causal discovery with generalized structural equation models suited for analyzing diverse types of outcomes, including discrete, continuous, and mixed data. Causal discovery often faces challenges due to unmeasured confounders that hinder the identification of causal relationships. The proposed approach addresses this issue by developing two peeling algorithms (bottom-up and top-down) to ascertain causal relationships and valid instruments. This approach first reconstructs a super-graph to represent ancestral relationships between variables, using a peeling algorithm based on nodewise GLM regressions that exploit relationships between primary and instrumental variables. Then, it estimates parent-child effects from the ancestral relationships using another peeling algorithm while deconfounding a child's model with information borrowed from its parents' models. The article offers a theoretical analysis of the proposed approach, establishing conditions for model identifiability and providing statistical guarantees for accurately discovering parent-child relationships via the peeling algorithms. Furthermore, the article presents numerical experiments showcasing the effectiveness of our approach in comparison to state-of-the-art structure learning methods without confounders. Lastly, it demonstrates an application to Alzheimer's disease (AD), highlighting the method's utility in constructing gene-to-gene and gene-to-disease regulatory networks involving Single Nucleotide Polymorphisms (SNPs) for healthy and AD subjects.
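
As an illustration of the bottom-up peeling idea, the following toy sketch uses a Gaussian linear SEM (rather than general GLMs) in which each primary variable has its own instrument: a variable is peeled as a leaf when its instrument shows no association with any other remaining variable. The threshold, data-generating process, and regression test are simplifications assumed for the demo.

```python
# Toy, bottom-up "peeling" sketch in the spirit of the paper (linear SEM
# instead of general GLMs; per-variable instruments and the coefficient
# threshold are assumptions for this demo).
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Ground-truth chain X1 -> X2 -> X3, each X_j with its own instrument Z_j.
Z = rng.normal(size=(n, 3))
X = np.empty((n, 3))
X[:, 0] = Z[:, 0] + rng.normal(0, 0.5, n)
X[:, 1] = 0.8 * X[:, 0] + Z[:, 1] + rng.normal(0, 0.5, n)
X[:, 2] = 0.8 * X[:, 1] + Z[:, 2] + rng.normal(0, 0.5, n)

def instrument_effect(j, k, remaining):
    """OLS coefficient of Z_j when regressing X_k on the remaining instruments."""
    cols = sorted(remaining)
    beta, *_ = np.linalg.lstsq(Z[:, cols], X[:, k], rcond=None)
    return abs(beta[cols.index(j)])

remaining, order = {0, 1, 2}, []
while remaining:
    # A leaf's instrument should not move any *other* remaining variable.
    leaves = [j for j in remaining
              if all(instrument_effect(j, k, remaining) < 0.1
                     for k in remaining if k != j)]
    order.extend(leaves)
    remaining -= set(leaves)

print("peeled (leaves first):", order)  # expected: [2, 1, 0]
```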

{"title":"Causal Discovery with Generalized Linear Models through Peeling Algorithms.","authors":"Minjie Wang, Xiaotong Shen, Wei Pan","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This article presents a novel method for causal discovery with generalized structural equation models suited for analyzing diverse types of outcomes, including discrete, continuous, and mixed data. Causal discovery often faces challenges due to unmeasured confounders that hinder the identification of causal relationships. The proposed approach addresses this issue by developing two peeling algorithms (bottom-up and top-down) to ascertain causal relationships and valid instruments. This approach first reconstructs a super-graph to represent ancestral relationships between variables, using a peeling algorithm based on nodewise GLM regressions that exploit relationships between primary and instrumental variables. Then, it estimates parent-child effects from the ancestral relationships using another peeling algorithm while deconfounding a child's model with information borrowed from its parents' models. The article offers a theoretical analysis of the proposed approach, establishing conditions for model identifiability and providing statistical guarantees for accurately discovering parent-child relationships via the peeling algorithms. Furthermore, the article presents numerical experiments showcasing the effectiveness of our approach in comparison to state-of-the-art structure learning methods without confounders. Lastly, it demonstrates an application to Alzheimer's disease (AD), highlighting the method's utility in constructing gene-to-gene and gene-to-disease regulatory networks involving Single Nucleotide Polymorphisms (SNPs) for healthy and AD subjects.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":4.3,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11699566/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142933418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Graphical Dirichlet Process for Clustering Non-Exchangeable Grouped Data.
IF 4.3 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Arhit Chakrabarti, Yang Ni, Ellen Ruth A Morris, Michael L Salinas, Robert S Chapkin, Bani K Mallick

We consider the problem of clustering grouped data with possibly non-exchangeable groups whose dependencies can be characterized by a known directed acyclic graph. To allow the sharing of clusters among the non-exchangeable groups, we propose a Bayesian nonparametric approach, termed graphical Dirichlet process, that jointly models the dependent group-specific random measures by assuming each random measure to be distributed as a Dirichlet process whose concentration parameter and base probability measure depend on those of its parent groups. The resulting joint stochastic process respects the Markov property of the directed acyclic graph that links the groups. We characterize the graphical Dirichlet process using a novel hypergraph representation as well as the stick-breaking representation, the restaurant-type representation, and the representation as a limit of a finite mixture model. We develop an efficient posterior inference algorithm and illustrate our model with simulations and a real grouped single-cell data set.
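
A truncated stick-breaking sketch can make the cluster-sharing mechanism concrete: a child group's Dirichlet process draws its base atoms partly from its parent's realized measure, so atoms (clusters) propagate along DAG edges. The truncation level and the 0.7 mixing weight are illustrative assumptions, not the paper's construction of the dependent base measure.

```python
# Minimal truncated stick-breaking sketch of the idea behind the graphical
# Dirichlet process: a child group's DP reuses its parent's atoms through
# the base measure, so clusters are shared along DAG edges.
import numpy as np

rng = np.random.default_rng(2)
T = 50  # truncation level

def stick_breaking(alpha, draw_atom):
    v = rng.beta(1, alpha, T)
    w = v * np.concatenate(([1.0], np.cumprod(1 - v)[:-1]))
    atoms = np.array([draw_atom() for _ in range(T)])
    return w, atoms

# Root group: atoms drawn from a N(0, 4) base measure.
w_root, atoms_root = stick_breaking(alpha=2.0, draw_atom=lambda: rng.normal(0, 2))

# Child group: with prob. 0.7 reuse an atom from the parent's realized
# measure (sampled by its weights), otherwise draw a fresh atom.
def child_atom():
    if rng.random() < 0.7:
        return atoms_root[rng.choice(T, p=w_root)]
    return rng.normal(0, 2)

w_child, atoms_child = stick_breaking(alpha=2.0, draw_atom=child_atom)

shared = np.isin(atoms_child, atoms_root).mean()
print(f"fraction of child atoms inherited from parent: {shared:.2f}")
```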

{"title":"Graphical Dirichlet Process for Clustering Non-Exchangeable Grouped Data.","authors":"Arhit Chakrabarti, Yang Ni, Ellen Ruth A Morris, Michael L Salinas, Robert S Chapkin, Bani K Mallick","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We consider the problem of clustering grouped data with possibly non-exchangeable groups whose dependencies can be characterized by a known directed acyclic graph. To allow the sharing of clusters among the non-exchangeable groups, we propose a Bayesian nonparametric approach, termed graphical Dirichlet process, that jointly models the dependent group-specific random measures by assuming each random measure to be distributed as a Dirichlet process whose concentration parameter and base probability measure depend on those of its parent groups. The resulting joint stochastic process respects the Markov property of the directed acyclic graph that links the groups. We characterize the graphical Dirichlet process using a novel hypergraph representation as well as the stick-breaking representation, the restaurant-type representation, and the representation as a limit of a finite mixture model. We develop an efficient posterior inference algorithm and illustrate our model with simulations and a real grouped single-cell data set.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":4.3,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11650374/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142848140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Convergence for nonconvex ADMM, with applications to CT imaging.
IF 4.3 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Rina Foygel Barber, Emil Y Sidky

The alternating direction method of multipliers (ADMM) algorithm is a powerful and flexible tool for complex optimization problems of the form min{f(x) + g(y) : Ax + By = c}. ADMM exhibits robust empirical performance across a range of challenging settings, including nonsmoothness and nonconvexity of the objective functions f and g, and provides a simple and natural approach to the inverse problem of image reconstruction for computed tomography (CT) imaging. From the theoretical point of view, existing results for convergence in the nonconvex setting generally assume smoothness in at least one of the component functions in the objective. In this work, our new theoretical results provide convergence guarantees under a restricted strong convexity assumption without requiring smoothness or differentiability, while still allowing differentiable terms to be treated approximately if needed. We validate these theoretical results empirically, with a simulated example where both f and g are nondifferentiable (and thus outside the scope of existing theory), as well as a simulated CT image reconstruction problem.
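
For concreteness, here is a standard ADMM iteration for the template min{f(x) + g(y) : Ax + By = c}, instantiated as the lasso (A = I, B = -I, c = 0). This is the generic algorithm the paper analyzes, not the authors' CT reconstruction code; the step size and iteration count are illustrative choices.

```python
# Generic ADMM sketch for min f(x) + g(y) s.t. x - y = 0, with
# f(x) = 0.5*||Ax - b||^2 and g(y) = lam*||y||_1 (the lasso).
import numpy as np

rng = np.random.default_rng(3)
m, n, lam, rho = 100, 20, 0.5, 1.0
A = rng.normal(size=(m, n))
x_true = np.zeros(n); x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + rng.normal(0, 0.1, m)

x = y = u = np.zeros(n)
AtA, Atb = A.T @ A, A.T @ b
# Factor once: the x-update solves (A^T A + rho I) x = A^T b + rho (y - u).
L = np.linalg.cholesky(AtA + rho * np.eye(n))

for _ in range(200):
    x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (y - u)))
    y = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0)  # soft-threshold
    u = u + x - y  # scaled dual update

print("nonzeros recovered:", np.flatnonzero(np.abs(y) > 1e-3))
```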

{"title":"Convergence for nonconvex ADMM, with applications to CT imaging.","authors":"Rina Foygel Barber, Emil Y Sidky","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The alternating direction method of multipliers (ADMM) algorithm is a powerful and flexible tool for complex optimization problems of the form <math><mi>m</mi> <mi>i</mi> <mi>n</mi> <mo>{</mo> <mi>f</mi> <mo>(</mo> <mi>x</mi> <mo>)</mo> <mo>+</mo> <mi>g</mi> <mo>(</mo> <mi>y</mi> <mo>)</mo> <mspace></mspace> <mo>:</mo> <mspace></mspace> <mi>A</mi> <mi>x</mi> <mo>+</mo> <mi>B</mi> <mi>y</mi> <mo>=</mo> <mi>c</mi> <mo>}</mo></math> . ADMM exhibits robust empirical performance across a range of challenging settings including nonsmoothness and nonconvexity of the objective functions <math><mi>f</mi></math> and <math><mi>g</mi></math> , and provides a simple and natural approach to the inverse problem of image reconstruction for computed tomography (CT) imaging. From the theoretical point of view, existing results for convergence in the nonconvex setting generally assume smoothness in at least one of the component functions in the objective. In this work, our new theoretical results provide convergence guarantees under a restricted strong convexity assumption without requiring smoothness or differentiability, while still allowing differentiable terms to be treated approximately if needed. We validate these theoretical results empirically, with a simulated example where both <math><mi>f</mi></math> and <math><mi>g</mi></math> are nondifferentiable-and thus outside the scope of existing theory-as well as a simulated CT image reconstruction problem.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":4.3,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11155492/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141297149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Framework for Improving the Reliability of Black-box Variational Inference.
IF 5.2 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Manushi Welandawe, Michael Riis Andersen, Aki Vehtari, Jonathan H Huggins

Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics as a fast yet flexible alternative to Markov chain Monte Carlo methods for approximate Bayesian inference. However, stochastic optimization methods for BBVI remain unreliable and require substantial expertise and hand-tuning to apply effectively. In this paper, we propose robust and automated black-box VI (RABVI), a framework for improving the reliability of BBVI optimization. RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation. RABVI adaptively decreases the learning rate by detecting convergence of the fixed-learning-rate iterates, then estimates the symmetrized Kullback-Leibler (KL) divergence between the current variational approximation and the optimal one. It also employs a novel optimization termination criterion that enables the user to balance desired accuracy against computational cost by comparing (i) the predicted relative decrease in the symmetrized KL divergence if a smaller learning rate were used and (ii) the predicted computation required to converge with the smaller learning rate. We validate the robustness and accuracy of RABVI through carefully designed simulation studies and on a diverse set of real-world model and data examples.
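
The sketch below illustrates two of the mechanisms named in the abstract on a one-dimensional Gaussian example: the learning rate is successively halved, and the closed-form symmetrized KL between successive Gaussian approximations serves as the termination signal. The fixed per-stage budget stands in for RABVI's rigorous convergence detection, and all constants are illustrative.

```python
# Sketch of two RABVI ingredients on a 1-D Gaussian example: halve the
# learning rate stage by stage, and stop when the symmetrized KL between
# successive approximations stops improving.
import numpy as np

rng = np.random.default_rng(4)
mu_p, sd_p = 3.0, 2.0  # target N(3, 2^2)

def sym_kl(m1, s1, m2, s2):
    def kl(ma, sa, mb, sb):
        return np.log(sb / sa) + (sa**2 + (ma - mb)**2) / (2 * sb**2) - 0.5
    return kl(m1, s1, m2, s2) + kl(m2, s2, m1, s1)

mu, log_sd, lr = 0.0, 0.0, 0.1
prev = (mu, np.exp(log_sd))
while lr > 1e-4:
    trace = []
    for _ in range(3000):  # fixed-budget stage at this learning rate
        eps = rng.normal()
        sd = np.exp(log_sd)
        x = mu + sd * eps                 # reparameterized draw from q
        dlogp = -(x - mu_p) / sd_p**2     # score of the Gaussian target
        mu -= lr * (-dlogp)               # stochastic grad of KL(q || p)
        log_sd -= lr * (-dlogp * eps * sd - 1.0)
        trace.append((mu, np.exp(log_sd)))
    # Average the (assumed stationary) second half of the stage's iterates.
    cur = tuple(np.mean(trace[len(trace) // 2:], axis=0))
    change = sym_kl(*prev, *cur)
    print(f"lr={lr:.5f}  q=N({cur[0]:.3f}, {cur[1]:.3f}^2)  symKL change={change:.5f}")
    if change < 1e-4:  # refinement no longer pays for its cost: stop
        break
    prev, lr = cur, lr / 2
```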

{"title":"A Framework for Improving the Reliability of Black-box Variational Inference.","authors":"Manushi Welandawe, Michael Riis Andersen, Aki Vehtari, Jonathan H Huggins","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Black-box variational inference (BBVI) now sees widespread use in machine learning and statistics as a fast yet flexible alternative to Markov chain Monte Carlo methods for approximate Bayesian inference. However, stochastic optimization methods for BBVI remain unreliable and require substantial expertise and hand-tuning to apply effectively. In this paper, we propose <i>robust and automated black-box VI</i> (RABVI), a framework for improving the reliability of BBVI optimization. RABVI is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation. RABVI adaptively decreases the learning rate by detecting convergence of the fixed-learning-rate iterates, then estimates the symmetrized Kullback-Leibler (KL) divergence between the current variational approximation and the optimal one. It also employs a novel optimization termination criterion that enables the user to balance desired accuracy against computational cost by comparing (i) the predicted relative decrease in the symmetrized KL divergence if a smaller learning were used and (ii) the predicted computation required to converge with the smaller learning rate. We validate the robustness and accuracy of RABVI through carefully designed simulation studies and on a diverse set of real-world model and data examples.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 219","pages":"1-71"},"PeriodicalIF":5.2,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12668294/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145662213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Learning Optimal Dynamic Treatment Regimens Subject to Stagewise Risk Controls.
IF 5.2 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Mochuan Liu, Yuanjia Wang, Haoda Fu, Donglin Zeng

Dynamic treatment regimens (DTRs) aim at tailoring individualized sequential treatment rules that maximize cumulative beneficial outcomes by accommodating patients' heterogeneity in decision-making. For many chronic diseases including type 2 diabetes mellitus (T2D), treatments are usually multifaceted in the sense that aggressive treatments with a higher expected reward are also likely to elevate the risk of acute adverse events. In this paper, we propose a new weighted learning framework, namely benefit-risk dynamic treatment regimens (BR-DTRs), to address the benefit-risk trade-off. The new framework relies on a backward learning procedure by restricting the induced risk of the treatment rule to be no larger than a pre-specified risk constraint at each treatment stage. Computationally, the estimated treatment rule solves a weighted support vector machine problem with a modified smooth constraint. Theoretically, we show that the proposed DTRs are Fisher consistent, and we further obtain the convergence rates for both the value and risk functions. Finally, the performance of the proposed method is demonstrated via extensive simulation studies and application to a real study for T2D patients.
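
A single-stage sketch of the benefit-risk trade-off, assuming scikit-learn: observed rewards weight an outcome-weighted SVM toward the received treatment, and the risk is folded in through a Lagrange-style penalty that is increased until the learned rule's estimated risk meets the budget. The paper instead solves a weighted SVM with a modified smooth constraint and learns backward over stages; this penalized one-stage variant is only illustrative.

```python
# One-stage benefit-risk sketch via outcome-weighted learning with a
# Lagrange-style risk penalty (a simplification of the paper's constrained
# weighted SVM). Requires scikit-learn.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(5)
n = 1000
X = rng.normal(size=(n, 2))
A = rng.choice([-1, 1], n)                        # randomized treatment, propensity 0.5
opt = np.sign(X[:, 0])                            # true optimal rule
R = 1.0 + (A == opt) + rng.normal(0, 0.2, n)      # reward: higher when A matches
S = 0.5 + 0.8 * (A == 1) + rng.normal(0, 0.1, n)  # aggressive arm is riskier
tau = 0.8                                         # stagewise risk budget

for lam in [0.0, 0.5, 1.0, 2.0, 4.0]:
    w = np.maximum(R - lam * S, 1e-3) / 0.5       # penalized, propensity-weighted
    rule = SVC(kernel="linear", C=1.0).fit(X, A, sample_weight=w)
    keep = rule.predict(X) == A
    # Inverse-probability estimates of value and risk under the rule.
    value = np.mean(R * keep / 0.5)
    risk = np.mean(S * keep / 0.5)
    print(f"lam={lam:.1f}  est. value={value:.2f}  est. risk={risk:.2f}")
    if risk <= tau:
        break
```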

{"title":"Learning Optimal Dynamic Treatment Regimens Subject to Stagewise Risk Controls.","authors":"Mochuan Liu, Yuanjia Wang, Haoda Fu, Donglin Zeng","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Dynamic treatment regimens (DTRs) aim at tailoring individualized sequential treatment rules that maximize cumulative beneficial outcomes by accommodating patients' heterogeneity in decision-making. For many chronic diseases including type 2 diabetes mellitus (T2D), treatments are usually multifaceted in the sense that aggressive treatments with a higher expected reward are also likely to elevate the risk of acute adverse events. In this paper, we propose a new weighted learning framework, namely benefit-risk dynamic treatment regimens (BR-DTRs), to address the benefit-risk trade-off. The new framework relies on a backward learning procedure by restricting the induced risk of the treatment rule to be no larger than a pre-specified risk constraint at each treatment stage. Computationally, the estimated treatment rule solves a weighted support vector machine problem with a modified smooth constraint. Theoretically, we show that the proposed DTRs are Fisher consistent, and we further obtain the convergence rates for both the value and risk functions. Finally, the performance of the proposed method is demonstrated via extensive simulation studies and application to a real study for T2D patients.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12711320/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145783665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Rethinking Discount Regularization: New Interpretations, Unintended Consequences, and Solutions for Regularization in Reinforcement Learning.
IF 4.3 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2024-01-01
Sarah Rathnam, Sonali Parbhoo, Siddharth Swaroop, Weiwei Pan, Susan A Murphy, Finale Doshi-Velez

Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to avoid overfitting when faced with sparse or noisy data. It is commonly interpreted as de-emphasizing or ignoring delayed effects. In this paper, we prove two alternative views of discount regularization that expose unintended consequences and motivate novel regularization methods. In model-based RL, planning under a lower discount factor acts like a prior with stronger regularization on state-action pairs with more transition data. This leads to poor performance when the transition matrix is estimated from data sets with uneven amounts of data across state-action pairs. In model-free RL, discount regularization equates to planning using a weighted average Bellman update, where the agent plans as if the values of all state-action pairs are closer than implied by the data. Our equivalence theorems motivate simple methods that generalize discount regularization by setting parameters locally for individual state-action pairs rather than globally. We demonstrate the failures of discount regularization and how we remedy them using our state-action-specific methods across empirical examples with both tabular and continuous state spaces.
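
The following tabular sketch shows the kind of state-action-specific alternative the paper motivates: rather than one globally lowered discount, each pair (s, a) gets its own discount that shrinks more where transition data are scarce. The shrinkage rule gamma * n/(n + c) is an assumed illustrative choice, not the paper's exact method.

```python
# Tabular value iteration with state-action-specific discounts that apply
# stronger regularization where transition data are scarce.
import numpy as np

rng = np.random.default_rng(6)
S_n, A_n, gamma, c = 4, 2, 0.95, 5.0

# Maximum-likelihood transition estimates from uneven per-pair sample sizes.
counts = rng.integers(2, 100, size=(S_n, A_n))    # n(s, a)
P = rng.dirichlet(np.ones(S_n), size=(S_n, A_n))  # estimated P(s' | s, a)
Rw = rng.uniform(0, 1, size=(S_n, A_n))           # reward table

gamma_sa = gamma * counts / (counts + c)  # less data -> smaller discount

Q = np.zeros((S_n, A_n))
for _ in range(500):
    V = Q.max(axis=1)
    Q = Rw + gamma_sa * (P @ V)  # per-pair discounted Bellman backup

print("per-pair discounts:\n", np.round(gamma_sa, 3))
print("greedy policy:", Q.argmax(axis=1))
```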

{"title":"Rethinking Discount Regularization: New Interpretations, Unintended Consequences, and Solutions for Regularization in Reinforcement Learning.","authors":"Sarah Rathnam, Sonali Parbhoo, Siddharth Swaroop, Weiwei Pan, Susan A Murphy, Finale Doshi-Velez","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to avoid overfitting when faced with sparse or noisy data. It is commonly interpreted as de-emphasizing or ignoring delayed effects. In this paper, we prove two alternative views of discount regularization that expose unintended consequences and motivate novel regularization methods. In model-based RL, planning under a lower discount factor acts like a prior with stronger regularization on state-action pairs with more transition data. This leads to poor performance when the transition matrix is estimated from data sets with uneven amounts of data across state-action pairs. In model-free RL, discount regularization equates to planning using a weighted average Bellman update, where the agent plans as if the values of all state-action pairs are closer than implied by the data. Our equivalence theorems motivate simple methods that generalize discount regularization by setting parameters locally for individual state-action pairs rather than globally. We demonstrate the failures of discount regularization and how we remedy them using our state-action-specific methods across empirical examples with both tabular and continuous state spaces.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":4.3,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12058221/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144056986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Batch Normalization Preconditioning for Stochastic Gradient Langevin Dynamics
IF 6.0 · Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2023-06-01 · DOI: 10.4208/jml.220726a
Susanne Lange, Wei Deng, Q. Ye, Guang Lin
{"title":"Batch Normalization Preconditioning for Stochastic Gradient Langevin Dynamics","authors":"Susanne Lange, Wei Deng, Q. Ye, Guang Lin","doi":"10.4208/jml.220726a","DOIUrl":"https://doi.org/10.4208/jml.220726a","url":null,"abstract":"","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"132 2 1","pages":""},"PeriodicalIF":6.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76596604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization with Non-Isolated Local Minima
Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2023-06-01 · DOI: 10.4208/jml.230106
Taehee Ko and Xiantao Li
Non-convex loss functions arise frequently in modern machine learning, and for the theoretical analysis of stochastic optimization methods, the presence of non-isolated minima presents a unique challenge that has remained under-explored. In this paper, we study the local convergence of the stochastic gradient descent method to non-isolated global minima. Under mild assumptions, we estimate the probability for the iterations to stay near the minima by adopting the notion of stochastic stability. After establishing such stability, we present the lower-bound complexity in terms of various error criteria for a given error tolerance ε and a failure probability γ.
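
As a toy illustration of the setting, the sketch below runs SGD on f(x) = (||x||^2 - 1)^2 / 4, whose global minima form a circle (hence non-isolated), and estimates the probability that iterates remain near the minima set, echoing the stochastic-stability notion; all constants are illustrative.

```python
# SGD on f(x) = ((||x||^2 - 1)^2) / 4, whose minima form a circle: estimate
# the probability that iterates started near the minima set stay nearby.
import numpy as np

rng = np.random.default_rng(7)

def run(eta=0.02, steps=2000, band=0.2):
    x = np.array([1.1, 0.0])  # start near the circle of minima
    for _ in range(steps):
        grad = (x @ x - 1.0) * x       # exact gradient of f
        noise = rng.normal(0, 0.5, 2)  # stochastic gradient noise
        x = x - eta * (grad + noise)
        if abs(np.linalg.norm(x) - 1.0) > band:
            return False               # left the neighborhood
    return True

stay = np.mean([run() for _ in range(200)])
print(f"estimated probability of staying near the minima set: {stay:.2f}")
```
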
{"title":"A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization with NonIsolated Local Minima","authors":"Taehee Ko and Xiantao Li","doi":"10.4208/jml.230106","DOIUrl":"https://doi.org/10.4208/jml.230106","url":null,"abstract":"Non-convex loss functions arise frequently in modern machine learning, and for the theoretical analysis of stochastic optimization methods, the presence of non-isolated minima presents a unique challenge that has remained under-explored. In this paper, we study the local convergence of the stochastic gradient descent method to non-isolated global minima. Under mild assumptions, we estimate the probability for the iterations to stay near the minima by adopting the notion of stochastic stability. After establishing such stability, we present the lower bound complexity in terms of various error criteria for a given error tolerance ǫ and a failure probability γ .","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135674517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Efficient Anti-Symmetrization of a Neural Network Layer by Taming the Sign Problem
Tier 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2023-06-01 · DOI: 10.4208/jml.230703
Nilin Abrahamsen and Lin Lin
Explicit antisymmetrization of a neural network is a potential candidate for a universal function approximator for generic antisymmetric functions, which are ubiquitous in quantum physics. However, this procedure is a priori factorially costly to implement, making it impractical for large numbers of particles. The strategy also suffers from a sign problem. Namely, due to near-exact cancellation of positive and negative contributions, the magnitude of the antisymmetrized function may be significantly smaller than before anti-symmetrization. We show that the anti-symmetric projection of a two-layer neural network can be evaluated efficiently, opening the door to using a generic antisymmetric layer as a building block in anti-symmetric neural network Ansatzes. This approximation is effective when the sign problem is controlled, and we show that this property depends crucially on the choice of activation function under standard Xavier/He initialization methods. As a consequence, using a smooth activation function requires re-scaling of the neural network weights compared to standard initializations.
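
The brute-force procedure referred to above can be written in a few lines: sum the network over all n! particle permutations with alternating signs. The sketch below shows both the factorial cost and the sign problem (the antisymmetrized value is typically far smaller than the summed magnitudes); the two-layer tanh architecture and all shapes are illustrative assumptions.

```python
# Explicit anti-symmetrization of a generic two-layer network over particle
# permutations: n! terms, with near-cancellation (the sign problem).
import itertools
import numpy as np

rng = np.random.default_rng(8)
n, d, h = 4, 2, 16  # particles, per-particle dim, hidden width
W1 = rng.normal(size=(n * d, h)); b1 = rng.normal(size=h)
w2 = rng.normal(size=h)

def f(X):  # generic (non-symmetric) two-layer network on flattened input
    return np.tanh(X.reshape(-1) @ W1 + b1) @ w2

def antisymmetrize(X):
    total, mag = 0.0, 0.0
    for perm in itertools.permutations(range(n)):    # n! terms
        sign = np.linalg.det(np.eye(n)[list(perm)])  # permutation sign: +/-1
        val = f(X[list(perm)])
        total += sign * val
        mag += abs(val)
    return total, mag

X = rng.normal(size=(n, d))
anti, mag = antisymmetrize(X)
print(f"antisymmetrized value {anti:.3e} vs. summed magnitudes {mag:.3e}")
# Swapping two particles flips the sign exactly:
X_swapped = X[[1, 0, 2, 3]]
print(f"after swapping particles 0 and 1: {antisymmetrize(X_swapped)[0]:.3e}")
```
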
{"title":"Efficient Anti-Symmetrization of a Neural Network Layer by Taming the Sign Problem","authors":"Nilin Abrahamsen and Lin Lin","doi":"10.4208/jml.230703","DOIUrl":"https://doi.org/10.4208/jml.230703","url":null,"abstract":"Explicit antisymmetrization of a neural network is a potential candidate for a universal function approximator for generic antisymmetric functions, which are ubiquitous in quantum physics. However, this procedure is a priori factorially costly to implement, making it impractical for large numbers of particles. The strategy also suffers from a sign problem. Namely, due to near-exact cancellation of positive and negative contributions, the magnitude of the antisymmetrized function may be significantly smaller than before anti-symmetrization. We show that the anti-symmetric projection of a two-layer neural network can be evaluated efficiently, opening the door to using a generic antisymmetric layer as a building block in anti-symmetric neural network Ansatzes. This approximation is effective when the sign problem is controlled, and we show that this property depends crucially the choice of activation function under standard Xavier/He initialization methods. As a consequence, using a smooth activation function requires re-scaling of the neural network weights compared to standard initializations.","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135144017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0