On Selecting and Conditioning in Multiple Testing and Selective Inference

Biometrika (IF 2.4; Zone 2, Mathematics; JCR Q2, Biology) | Pub Date: 2023-12-22 | DOI: 10.1093/biomet/asad078
Jelle J Goeman, Aldo Solari

Abstract

We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven collection of hypotheses is chosen from some large universe of hypotheses. Subsequently, inference takes place within this data-driven collection, conditioned on the information that was used for the selection. Examples of such methods include basic data splitting, as well as modern data carving methods and post-selection inference methods for lasso coefficients based on the polyhedral lemma. In this paper, we adopt a holistic view on such methods, considering the selection, conditioning, and final error control steps together as a single method. From this perspective, we demonstrate that multiple testing methods defined directly on the full universe of hypotheses are always at least as powerful as selective inference methods based on selection and conditioning. This result holds true even when the universe is potentially infinite and only implicitly defined, such as in the case of data splitting. We give general theory and intuitions before investigating in detail several case studies where a shift to a non-selective or unconditional perspective can yield a power gain.
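
As an illustration of the two-stage process described in the abstract, the following is a minimal sketch of basic data splitting. It is not the authors' code. The function name `data_splitting`, the selection threshold, the one-sided unit-variance normal model, and the Bonferroni correction within the selected collection are all assumptions made for this example.

```python
import math

import numpy as np


def one_sided_p(z):
    """Upper-tail p-value of a standard normal test statistic."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def data_splitting(x, n_select, threshold=1.0, alpha=0.05):
    """Two-stage selective inference by data splitting (illustrative sketch).

    x is an (n, m) array; column j holds n observations for the hypothesis
    H_j: mean_j <= 0. Unit variance is assumed (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    first, second = x[:n_select], x[n_select:]

    # Stage 1: choose a data-driven collection of hypotheses from the first part.
    z1 = first.mean(axis=0) * math.sqrt(first.shape[0])
    selected = np.flatnonzero(z1 > threshold)

    # Stage 2: test only the selected hypotheses on the held-out part,
    # with a Bonferroni correction over the selected collection.
    z2 = second.mean(axis=0) * math.sqrt(second.shape[0])
    rejected = [
        int(j) for j in selected if one_sided_p(z2[j]) <= alpha / len(selected)
    ]
    return selected.tolist(), rejected
```

Because the second part of the data is independent of the first, the stage 2 tests are valid conditionally on the selection. The sketch shows only the select-then-condition approach; it does not implement the unconditional methods on the full universe of hypotheses that the paper compares it with.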
Source journal: Biometrika (category: Biology)
CiteScore: 5.50
Self-citation rate: 3.70%
Articles published: 56
Review time: 6-12 weeks
About the journal: Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.