Beyond Neyman-Pearson: E-values enable hypothesis testing with a data-driven alpha.

Proceedings of the National Academy of Sciences of the United States of America (IF 9.4; Zone 1, multidisciplinary journals; JCR Q1, Multidisciplinary Sciences) Pub Date: 2024-09-20 DOI: 10.1073/pnas.2302098121
Peter D Grünwald

Abstract

A standard practice in statistical hypothesis testing is to mention the P-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With P-values, it is not clear how to use an extreme observation (e.g., p ≪ α) for getting better frequentist decisions. With e-values it is straightforward, since they provide Type-I risk control in a generalized Neyman-Pearson setting with the decision task (a general loss function) determined post hoc, after observation of the data, thereby providing a handle on "roving α's." When Type-II risks are taken into consideration, the only admissible decision rules in the post hoc setting turn out to be e-value-based. Similarly, if the loss incurred when specifying a faulty confidence interval is not fixed in advance, standard confidence intervals and distributions may fail, whereas e-confidence sets and e-posteriors still provide valid risk guarantees. Sufficiently powerful e-values have by now been developed for a range of classical testing problems. We discuss the main challenges for wider development and deployment.
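
As a rough illustration of the kind of object the abstract refers to (not the paper's own construction), the sketch below uses the textbook definition of an e-value: a nonnegative statistic whose expectation under the null hypothesis is at most 1. For a simple null, a likelihood ratio is one such statistic, and Markov's inequality then bounds the null probability that it reaches 1/α by α. The coin-flip setting, the parameter values `p0` and `p1`, and all function names are assumptions made for this example. The post hoc risk guarantee described in the abstract is a stronger statement that this sketch does not reproduce.

```python
from math import comb


def likelihood_ratio_e_value(k, n, p0=0.5, p1=0.7):
    """Likelihood ratio for k successes in n Bernoulli trials.

    Numerator: likelihood under the (assumed) alternative p1.
    Denominator: likelihood under the (assumed) simple null p0.
    """
    return (p1 / p0) ** k * ((1 - p1) / (1 - p0)) ** (n - k)


def null_pmf(k, n, p0=0.5):
    """Binomial probability of k successes in n trials under the null."""
    return comb(n, k) * p0 ** k * (1 - p0) ** (n - k)


def null_expectation(n, p0=0.5, p1=0.7):
    """Exact expectation of the likelihood ratio under the null."""
    return sum(
        null_pmf(k, n, p0) * likelihood_ratio_e_value(k, n, p0, p1)
        for k in range(n + 1)
    )


def null_rejection_probability(n, alpha, p0=0.5, p1=0.7):
    """Exact null probability that the e-value is at least 1 / alpha."""
    return sum(
        null_pmf(k, n, p0)
        for k in range(n + 1)
        if likelihood_ratio_e_value(k, n, p0, p1) >= 1 / alpha
    )


if __name__ == "__main__":
    n = 20
    print("E_0[e-value] =", null_expectation(n))
    for alpha in (0.05, 0.01, 0.001):
        print(alpha, null_rejection_probability(n, alpha))
```

The same statistic is used for every α here; only the threshold 1/α changes. That is the feature that makes e-values a natural candidate for settings where the level is not fixed in advance.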
Source journal
CiteScore: 19.00
Self-citation rate: 0.90%
Articles published: 3575
Review time: 2.5 months
About the journal: The Proceedings of the National Academy of Sciences (PNAS), a peer-reviewed journal of the National Academy of Sciences (NAS), serves as an authoritative source for high-impact, original research across the biological, physical, and social sciences. With a global scope, the journal welcomes submissions from researchers worldwide, making it an inclusive platform for advancing scientific knowledge.