A confidence machine for sparse high-order interaction model

Journal: Stat · IF 0.7 · Zone 4 (Mathematics) · JCR Q3 (Statistics & Probability) · Pub Date: 2024-02-05 · DOI: 10.1002/sta4.633
Diptesh Das, Eugene Ndiaye, Ichiro Takeuchi
Citations: 0

Abstract

In predictive modelling for high-stakes decision-making, predictors must be not only accurate but also reliable. Conformal prediction (CP) is a promising approach for obtaining coverage guarantees for prediction results under fewer theoretical assumptions. To obtain the prediction set by so-called full-CP, we need to refit the predictor for all possible values of the prediction result, which is only possible for simple predictors. For complex predictors such as random forests (RFs) or neural networks (NNs), split-CP is often employed, where the data is split into two parts: one part for fitting and another for computing the prediction set. Unfortunately, because of the reduced sample size, split-CP is inferior to full-CP in both fitting and prediction set computation. In this paper, we develop a full-CP for the sparse high-order interaction model (SHIM), which is sufficiently flexible as it can take into account high-order interactions among variables. We resolve the computational challenge of full-CP for SHIM by introducing a novel approach called homotopy mining. Through numerical experiments, we demonstrate that SHIM is as accurate as complex predictors such as RF and NN and enjoys the superior statistical power of full-CP.
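
The contrast between split-CP and full-CP drawn in the abstract can be sketched in a few lines. This is an illustrative sketch only and not the authors' homotopy-mining method. It assumes ridge regression as the "simple predictor", absolute residuals as conformity scores, and a finite grid of candidate values standing in for "all possible values". All function names and parameters (`fit_ridge`, `split_cp_interval`, `full_cp_set`, `lam`, `candidates`) are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge coefficients (assumed stand-in for a simple predictor)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def split_cp_interval(X, y, x_new, alpha=0.1, lam=1.0, seed=0):
    """Split-CP: fit on one half, calibrate absolute residuals on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    fit_idx, cal_idx = idx[:half], idx[half:]
    beta = fit_ridge(X[fit_idx], y[fit_idx], lam)
    scores = np.sort(np.abs(y[cal_idx] - X[cal_idx] @ beta))
    n_cal = len(scores)
    # Rank of the conformal quantile among the calibration scores.
    k = int(np.ceil((1 - alpha) * (n_cal + 1)))
    q = scores[k - 1] if k <= n_cal else np.inf
    pred = x_new @ beta
    return pred - q, pred + q


def full_cp_set(X, y, x_new, candidates, alpha=0.1, lam=1.0):
    """Grid-based full-CP: refit once per candidate value z of the new response."""
    X_aug = np.vstack([X, x_new])
    kept = []
    for z in candidates:
        y_aug = np.append(y, z)
        beta = fit_ridge(X_aug, y_aug, lam)  # one refit per candidate
        scores = np.abs(y_aug - X_aug @ beta)
        # Conformal p-value: share of scores at least as large as the new point's.
        p_value = np.mean(scores >= scores[-1])
        if p_value > alpha:
            kept.append(z)
    return np.array(kept)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(60, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=60)
    x_new = np.array([0.2, -0.1, 0.3])
    print("split-CP interval:", split_cp_interval(X, y, x_new))
    grid = np.linspace(y.min() - 1, y.max() + 1, 201)
    kept = full_cp_set(X, y, x_new, grid)
    print("full-CP set range:", kept.min(), kept.max())
```

The loop in `full_cp_set` refits the model once per candidate value, which is the computational burden the abstract refers to. Split-CP avoids it by fitting once, at the price of using only part of the data for fitting and part for calibration.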
Source journal
Stat (Decision Sciences: Statistics, Probability and Uncertainty)
CiteScore: 1.10
Self-citation rate: 0.00%
Articles published: 85
Journal description: Stat is an innovative electronic journal for the rapid publication of novel and topical research results, publishing compact articles of the highest quality in all areas of statistical endeavour. Its purpose is to provide a means of rapid sharing of important new theoretical, methodological and applied research. Stat is a joint venture between the International Statistical Institute and Wiley-Blackwell. Stat is characterised by:
• Speed - a high-quality review process that aims to reach a decision within 20 days of submission.
• Concision - a maximum article length of 10 pages of text, not including references.
• Supporting materials - inclusion of electronic supporting materials including graphs, video, software, data and images.
• Scope - addresses all areas of statistics and interdisciplinary areas.
Stat is a scientific journal for the international community of statisticians and researchers and practitioners in allied quantitative disciplines.