A computational model for sample dependence in hypothesis testing of genome data
Authors: Sunhee Kim, Chang-Yong Lee
Journal: Journal of the Korean Physical Society, vol. 85, no. 2, pp. 201-211
Published: 2024-05-30
Publication type: Journal Article
DOI: 10.1007/s40042-024-01100-z
URL: https://link.springer.com/article/10.1007/s40042-024-01100-z
Impact factor: 0.8 (JCR Q3, Physics, Multidisciplinary)
Citations: 0
Abstract
Statistical hypothesis testing assumes that the samples being analyzed are statistically independent, meaning that the occurrence of one sample does not affect the probability of the occurrence of another. In reality, however, this assumption may not always hold. When samples are not independent, it is important to consider their interdependence when interpreting the results of the hypothesis test. In this study, we address the issue of sample dependence in hypothesis testing by introducing the concept of adjusted sample size. This adjusted sample size provides additional information about the test results, which is particularly useful when samples exhibit dependence. To determine the adjusted sample size, we use the theory of networks to quantify sample dependence and model the variance of network density as a function of sample size. Our approach involves estimating the adjusted sample size by analyzing the variance of the network density, which reflects the degree of sample dependence. Through simulations, we demonstrate that dependent samples yield a higher variance in network density than independent samples, validating our method for estimating the adjusted sample size. Furthermore, we apply our proposed method to genomic datasets, estimating the adjusted sample size to account for sample dependence in hypothesis testing. The adjusted sample size guides the interpretation of test results and supports more accurate data analysis.
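As a rough illustration of the simulation claim in the abstract, the following Python sketch compares the variance of network density for independent and dependent samples. Everything in it is an assumption made for illustration and is not taken from the paper: two samples are linked when the absolute Pearson correlation of their feature vectors exceeds a threshold, dependence is modeled as noisy copies of a few independent "founder" samples, and the function names, threshold, and sizes are hypothetical.

```python
import random
from itertools import combinations
from statistics import variance


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def network_density(samples, threshold=0.5):
    """Fraction of sample pairs linked by an edge.

    Assumed edge rule (not from the paper): |correlation| > threshold.
    """
    pairs = list(combinations(samples, 2))
    edges = sum(abs(pearson(x, y)) > threshold for x, y in pairs)
    return edges / len(pairs)


def independent_samples(rng, n, d):
    """n samples, each an independent vector of d standard normal values."""
    return [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]


def dependent_samples(rng, n, d, copies=4, noise=0.1):
    """n samples built as noisy copies of n // copies independent founders.

    Assumed dependence model (not from the paper); n should be a
    multiple of copies so that exactly n samples are returned.
    """
    founders = independent_samples(rng, n // copies, d)
    return [
        [value + rng.gauss(0, noise) for value in founder]
        for founder in founders
        for _ in range(copies)
    ]


def density_variance(make_samples, n=20, d=10, reps=200, seed=0):
    """Sample variance of network density over repeated simulated datasets."""
    rng = random.Random(seed)
    densities = [network_density(make_samples(rng, n, d)) for _ in range(reps)]
    return variance(densities)


if __name__ == "__main__":
    print("independent:", density_variance(independent_samples))
    print("dependent:  ", density_variance(dependent_samples))
```

Under these assumptions the dependent case should show the larger variance, because noisy copies add or remove edges in blocks and not one at a time. The abstract does not state the functional form the authors use to map this variance to an adjusted sample size, so the sketch stops at the comparison.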
About the journal:
The Journal of the Korean Physical Society (JKPS) covers all fields of physics, from statistical physics and condensed matter physics to particle physics. Manuscripts published in JKPS are required to be original, significant, and complete with respect to recent developments. The journal comprises Full Paper, Letters, and Brief sections. In addition, featured articles with outstanding results are selected by the Editorial Board and introduced in the online version. To strengthen the journal's international character, several world-distinguished researchers serve on the Editorial Board. High-quality papers may be published on an expedited basis when this is recommended or requested.