Safe Testing

P. Grünwald, R. D. Heide, Wouter M. Koolen
Venue: 2020 Information Theory and Applications Workshop (ITA)
DOI: 10.1109/ITA50056.2020.9244948 (https://doi.org/10.1109/ITA50056.2020.9244948)
Publication date: 2019-06-18
Citations: 140

Abstract

We present a new theory of hypothesis testing. The main concept is the S-value, a notion of evidence which, unlike p-values, allows for effortlessly combining evidence from several tests, even in the common scenario where the decision to perform a new test depends on the previous test outcome: safe tests based on S-values generally preserve Type-I error guarantees under such ‘optional continuation’. S-values exist for completely general testing problems with composite null and alternative hypotheses. Their prime interpretation is in terms of gambling or investing, each S-value corresponding to a particular investment. Surprisingly, optimal "GROW" S-values, which lead to fastest capital growth, are fully characterized by the joint information projection (JIPr) between the set of all Bayes marginal distributions on $\mathcal{H}_0$ and $\mathcal{H}_1$. Thus, optimal S-values also have an interpretation as Bayes factors, with priors given by the JIPr. We illustrate the theory using two classical testing scenarios: the one-sample t-test and the 2 × 2 contingency table. In the t-test setting, GROW S-values correspond to adopting the right Haar prior on the variance, as in Jeffreys’ Bayesian t-test. However, unlike Jeffreys’ test, the default safe t-test puts a discrete 2-point prior on the effect size, leading to better behaviour in terms of statistical power. Sharing Fisherian, Neymanian and Jeffreys-Bayesian interpretations, S-values and safe tests may provide a methodology acceptable to adherents of all three schools.
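
The optional-continuation claim can be illustrated with a small simulation. An S-value is a nonnegative statistic whose expected value is at most 1 under every distribution in the null, so by Markov's inequality a test that rejects when S is at least 1/alpha has Type-I error at most alpha, and products of S-values from successive tests keep this property even when the decision to run another test depends on earlier outcomes. The sketch below is not the paper's GROW construction or its safe t-test: it assumes the simplest possible setting, a simple null Bernoulli(0.5) against a simple alternative Bernoulli(0.7), where the likelihood ratio is an S-value. The alternative 0.7, the batch size, the number of batches, and the continuation rule are all hypothetical choices made for illustration.

```python
import random


def s_value(batch, p0=0.5, p1=0.7):
    """Likelihood ratio of Bernoulli(p1) against Bernoulli(p0) for one batch.

    For a simple null this ratio has expectation exactly 1 under the null,
    so it is an S-value. p0 and p1 are illustrative assumptions.
    """
    s = 1.0
    for x in batch:
        s *= (p1 if x else 1.0 - p1) / (p0 if x else 1.0 - p0)
    return s


def run_study(rng, true_p, batch_size=20, max_batches=5, alpha=0.05):
    """Multiply S-values over successive batches (the gambler's capital).

    Whether another batch is collected depends on the evidence so far,
    which is the 'optional continuation' scenario from the abstract.
    """
    capital = 1.0
    for _ in range(max_batches):
        batch = [rng.random() < true_p for _ in range(batch_size)]
        capital *= s_value(batch)
        # Hypothetical data-dependent rule: stop once the evidence is
        # conclusive (capital >= 1/alpha) or looks hopeless (capital < 0.5).
        if capital >= 1.0 / alpha or capital < 0.5:
            break
    return capital


def rejection_rate(true_p, n_sim=5000, alpha=0.05, seed=0):
    """Fraction of simulated studies whose final capital reaches 1/alpha."""
    rng = random.Random(seed)
    rejections = sum(
        run_study(rng, true_p, alpha=alpha) >= 1.0 / alpha for _ in range(n_sim)
    )
    return rejections / n_sim


if __name__ == "__main__":
    print("Estimated Type-I error (data from the null):", rejection_rate(0.5))
    print("Estimated power (data from the alternative):", rejection_rate(0.7))
```

Under the null, the estimated rejection rate should stay below alpha despite the data-dependent continuation rule. Applying the same rule to a p-value recomputed after each batch would in general inflate the Type-I error.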