P-Value Precision and Reproducibility

IF 1.8 | Zone 4 (Mathematics) | Q1 STATISTICS & PROBABILITY | American Statistician | Pub Date: 2011-01-01 | Epub Date: 2012-01-24 | DOI: 10.1198/tas.2011.10129

Dennis D Boos, Leonard A Stefanski

American Statistician, vol. 65, no. 4, pp. 213-221 (2011). https://doi.org/10.1198/tas.2011.10129

Citations: 178

Abstract



P-values are useful statistical measures of evidence against a null hypothesis. In contrast to other statistical estimates, however, their sample-to-sample variability is usually not considered or estimated, and therefore not fully appreciated. Via a systematic study of log-scale p-value standard errors, bootstrap prediction bounds, and reproducibility probabilities for future replicate p-values, we show that p-values exhibit surprisingly large variability in typical data situations. In addition to providing context to discussions about the failure of statistical results to replicate, our findings shed light on the relative value of exact p-values vis-a-vis approximate p-values, and indicate that the use of *, **, and *** to denote levels .05, .01, and .001 of statistical significance in subject-matter journals is about the right level of precision for reporting p-values when judged by widely accepted rules for rounding statistical estimates.
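The sample-to-sample variability described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure (their study uses log-scale standard errors, bootstrap prediction bounds, and reproducibility probabilities). It only repeats a two-sided z-test on fresh samples drawn from the same population and summarizes how widely the resulting p-values spread. The effect size, sample size, replicate count, known-variance assumption, and all function names are hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean == 0, assuming a known sigma."""
    z = mean(sample) * math.sqrt(len(sample)) / sigma
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_p_values(true_mean, n, replicates, seed=0):
    """Repeat the same experiment `replicates` times and collect the p-values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(replicates)
    ]


def summarize(p_values):
    """Spread of replicate p-values on the raw and -log10 scales."""
    ordered = sorted(p_values)
    k = len(ordered)
    # Floor at a tiny positive value so log10 never sees an exact zero.
    neg_logs = [-math.log10(max(p, 1e-300)) for p in p_values]
    return {
        "p_05": ordered[round(0.05 * (k - 1))],
        "p_50": ordered[round(0.50 * (k - 1))],
        "p_95": ordered[round(0.95 * (k - 1))],
        "sd_neg_log10_p": stdev(neg_logs),
        "frac_below_0.05": sum(p <= 0.05 for p in p_values) / k,
    }


if __name__ == "__main__":
    # Hypothetical setting: true mean 0.5, sigma 1, n = 30 per experiment.
    summary = summarize(simulate_p_values(true_mean=0.5, n=30, replicates=2000))
    for name, value in summary.items():
        print(f"{name}: {value:.4g}")
```

Even though every replicate comes from the same population and the same design, the 5th and 95th percentiles of the simulated p-values can differ by orders of magnitude, which is the kind of spread that makes reporting many digits of a single p-value hard to justify.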


Source journal

American Statistician (Mathematics: Statistics & Probability)

CiteScore: 3.50

Self-citation rate: 5.60%

Review time: >12 weeks

About the journal: Are you looking for general-interest articles about current national and international statistical problems and programs; interesting and fun articles of a general nature about statistics and its applications; or the teaching of statistics? Then you are looking for The American Statistician (TAS), published quarterly by the American Statistical Association. TAS contains timely articles organized into the following sections: Statistical Practice, General, Teacher's Corner, History Corner, Interdisciplinary, Statistical Computing and Graphics, Reviews of Books and Teaching Materials, and Letters to the Editor.
