A Continuous Generalization of Hypothesis Testing
Nick W. Koning
arXiv - STAT - Statistics Theory, 2024-09-09
https://doi.org/arxiv-2409.05654

Testing has developed into the fundamental statistical framework for
falsifying hypotheses. Unfortunately, tests are binary in nature: a test either
rejects a hypothesis or not. Such binary decisions do not reflect the reality
of many scientific studies, which often aim to present the evidence against a
hypothesis and do not necessarily intend to establish a definitive conclusion.
To solve this, we propose the continuous generalization of a test, which we use
to measure the evidence against a hypothesis. Such a continuous test can be
viewed as a non-randomized interpretation of the classical 'randomized
test'. This offers the benefits of a randomized test, without the downsides of
external randomization. Another interpretation is as a literal measure, which
measures the amount of binary tests that reject the hypothesis. Our work also
offers a new perspective on the $e$-value: the $e$-value is recovered as a
continuous test with $\alpha \to 0$, or as an unbounded measure of the amount
of rejections.
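
To make the randomized-test reading concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's own construction: the one-sided binomial setting, the function names, and the rescaling of the test value by 1/alpha are all choices made for this example. A classical randomized test would reject with probability phi(x); the continuous reading simply reports phi(x) itself, with no external coin flip.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def randomized_test_value(x: int, n: int, p0: float, alpha: float) -> float:
    """Value phi(x) in [0, 1] of the one-sided randomized test of H0: p = p0.

    Hypothetical helper for illustration: phi is 1 above a critical value c,
    a fraction at c, and 0 below it, chosen so that E[phi(X)] = alpha under H0.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    tail = 0.0  # P(X > c) under H0
    c = n
    # Lower c while the whole region {X >= c} still fits within level alpha.
    while c >= 0 and tail + binom_pmf(c, n, p0) <= alpha:
        tail += binom_pmf(c, n, p0)
        c -= 1
    if c < 0 or x > c:
        return 1.0
    if x == c:
        # Boundary point: spend the remaining level alpha - tail here.
        return (alpha - tail) / binom_pmf(c, n, p0)
    return 0.0


def evidence(x: int, n: int, p0: float, alpha: float) -> float:
    """Assumed rescaling: phi(x) / alpha, which lies in [0, 1 / alpha]."""
    return randomized_test_value(x, n, p0, alpha) / alpha


n, p0, alpha = 10, 0.5, 0.05
for x in (7, 8, 9):
    print(x, randomized_test_value(x, n, p0, alpha), evidence(x, n, p0, alpha))
```

With these inputs the boundary outcome x = 8 receives a value of roughly 0.89 rather than a forced reject-or-not decision. Under the assumed rescaling, the reported quantity has expectation 1 under the hypothesis and an upper bound of 1/alpha, so letting alpha shrink toward 0 removes the bound, which is one way to read the abstract's remark about recovering the $e$-value.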