Linked shrinkage to improve estimation of interaction effects in regression models.

Epidemiologic Methods (JCR Q3, Mathematics) | Pub Date: 2024-07-09 | eCollection Date: 2024-01-01 | DOI: 10.1515/em-2023-0039
Mark A van de Wiel, Matteo Amestoy, Jeroen Hoogland
Epidemiologic Methods, vol. 13, no. 1, article 20230039. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11232106/pdf/
Citation count: 0

Abstract

Objectives: Adding two-way interactions to a regression model is a classic problem in statistics, and it comes with the challenge of a quadratically increasing dimension. We aim to (a) devise an estimation method that can handle this challenge and (b) aid interpretation of the resulting model by developing computational tools for quantifying variable importance.
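
To make the quadratic growth concrete: p main effects give p(p-1)/2 two-way interactions, so the full design matrix has p + p(p-1)/2 columns. The following is a minimal Python sketch of that count; the helper name `add_pairwise_interactions` and the simulated data are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

import numpy as np


def add_pairwise_interactions(X):
    """Append all two-way product columns x_j * x_k (j < k) to X."""
    n, p = X.shape
    pairs = list(combinations(range(p), 2))
    if pairs:
        inter = np.column_stack([X[:, j] * X[:, k] for j, k in pairs])
    else:
        inter = np.empty((n, 0))  # a single covariate has no interactions
    return np.hstack([X, inter]), pairs


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))           # simulated covariates (made up)
X_full, pairs = add_pairwise_interactions(X)
print(X_full.shape)  # (100, 55): 10 main effects + 45 interactions
```

Doubling p to 20 already yields 190 interaction columns, which is why unpenalized estimation quickly becomes unstable at a moderate sample size.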

Methods: Existing strategies typically overcome the dimensionality problem by only allowing interactions between relevant main effects. Building on this philosophy, and aiming for settings with a moderate n-to-p ratio, we develop a local shrinkage model that links the shrinkage of interaction effects to the shrinkage of their corresponding main effects. In addition, we derive a new analytical formula for the Shapley value, which allows rapid assessment of individual-specific variable importance scores and their uncertainties.
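
The linking idea can be illustrated with a small prior simulation. The sketch below assumes, purely for illustration, half-Cauchy local scales λ_j for the main effects and a prior standard deviation of τ·λ_j·λ_k for the interaction between covariates j and k, so that an interaction is shrunk heavily whenever either of its parent main effects is. The abstract does not state the functional form of the link, so the product rule, the half-Cauchy choice, and all names here are assumptions rather than the authors' model.

```python
from itertools import combinations

import numpy as np


def sample_linked_prior(p, tau=1.0, seed=0):
    """Draw main and interaction effects from an assumed linked-shrinkage prior."""
    rng = np.random.default_rng(seed)
    # Local scale per main effect (half-Cauchy is an assumption).
    lam = np.abs(rng.standard_cauchy(p))
    beta_main = rng.normal(0.0, tau * lam)
    # Assumed link: interaction scale is the product of the parents' scales.
    pairs = list(combinations(range(p), 2))
    sd_inter = np.array([tau * lam[j] * lam[k] for j, k in pairs])
    beta_inter = rng.normal(0.0, sd_inter)
    return lam, beta_main, pairs, sd_inter, beta_inter


lam, beta_main, pairs, sd_inter, beta_inter = sample_linked_prior(p=5)
for (j, k), sd in zip(pairs, sd_inter):
    print(f"interaction ({j},{k}): prior sd = {sd:.3f}")
```

Under this assumed link, a main effect with a small local scale pulls every interaction it takes part in towards zero, which mirrors the "interactions only between relevant main effects" philosophy in a soft rather than hard form.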

Results: We empirically demonstrate that our approach provides accurate estimates of the model parameters and very competitive predictive accuracy. In our Bayesian framework, estimation inherently comes with inference, which facilitates variable selection. Comparisons with key competitors are provided. Large-scale cohort data are used to provide realistic illustrations and evaluations. The implementation of our method in RStan is relatively straightforward and flexible, allowing for adaptation to specific needs.

Conclusions: Our method is an attractive alternative to existing strategies for handling interactions in epidemiological and/or clinical studies, as its linked local shrinkage can improve parameter accuracy, prediction and variable selection. Moreover, it provides appropriate inference and interpretation, and may compete well with less interpretable machine learners in terms of prediction.
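
For readers unfamiliar with Shapley values as individual-specific importance scores, the sketch below computes them by brute-force enumeration of coalitions for a toy model with one interaction. This is not the analytical formula derived in the paper: it uses a simple baseline value function (features outside the coalition are set to a reference value), and the coefficients, inputs, and function names are made up for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values by enumerating all coalitions (small n only)."""
    phi = [0.0] * n_features
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        for size in range(n_features):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for S in combinations(others, size):
                # Marginal contribution of feature j to coalition S.
                phi[j] += weight * (value(set(S) | {j}) - value(set(S)))
    return phi


# Toy model: f(z) = b1*z1 + b2*z2 + b12*z1*z2 (all numbers are made up).
b1, b2, b12 = 1.0, -2.0, 0.5
x = [2.0, 3.0]          # the individual's covariates
baseline = [0.0, 0.0]   # reference values for features outside a coalition


def f(z):
    return b1 * z[0] + b2 * z[1] + b12 * z[0] * z[1]


def value(S):
    # Features in S take the individual's value, the rest the baseline.
    z = [x[k] if k in S else baseline[k] for k in range(len(x))]
    return f(z)


phi = shapley_values(value, 2)
print(phi)  # [3.5, -4.5]; the sum equals f(x) - f(baseline) = -1.0
```

In this toy case each feature receives its own main-effect term plus half of the interaction term. The enumeration cost grows exponentially with the number of features, which is why a closed-form expression is valuable for rapid assessment.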
