Rui Dong, Gao T Wang, Andrew T DeWan, Suzanne M Leal
BMC Genomics 26(1):222. Published 2025-03-06. DOI: 10.1186/s12864-025-11318-1 (https://doi.org/10.1186/s12864-025-11318-1). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11884093/pdf/
The case-only design is a powerful approach to detect interactions but should be used with caution.
Background: The case-only design is a powerful approach to identify gene × gene and gene × environment interactions for complex traits. It has been demonstrated that for the case-only design to be valid the genetic and environmental factors must be independent in the population. Additionally, there is a rare disease assumption for the case-only design, but the impact of disease prevalence and other factors, e.g., size of main effects, on type I and II error rates has not been investigated.
Methods: Through theoretical and extensive simulation studies, we investigated type I error, power, and bias of interaction term for a wide variety of disease prevalences, main and interaction effect sizes, sample sizes, and variant and environmental exposure frequencies.
Results: For diseases with prevalence < 4%, the case-only design usually has well-controlled type I error rates and is substantially more powerful to detect interactions than the case-control design, but for higher disease prevalences both type I and II error rates can be inflated and the estimate of the interaction term biased. However, when one or both main effects are large there can be inflated type I error rate even for low disease prevalences, e.g., < 1%, but if there is no or only one main effect, type I error rate is controlled regardless of the disease prevalence. Additionally, type I error rate can increase with sample size.
Conclusions: We determined the upper bound of the disease prevalence in order not to violate the rare disease assumption for the case-only design. To verify that a case-only design study does not have increased type I error rate, the bias of the interaction term should be estimated. Although the case-only design is a powerful method to detect interactions, prevalences for some complex traits are too high to implement this method without increasing type I error rates.
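The bias check recommended above can be illustrated with a small calculation. This is only a sketch under stated assumptions, not the authors' simulation code: it assumes a binary variant G and a binary exposure E that are independent in the population, and a logistic disease model with intercept b0, main effects bg and be, and interaction bge (all names and values here are hypothetical). Under those assumptions the G-E odds ratio among cases equals p11·p00 / (p10·p01), where pge is the disease risk for G = g and E = e, because the exposure frequencies cancel. Its log, minus bge, is the bias of the case-only interaction estimate.

```python
import math


def expit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def case_only_or(b0, bg, be, bge=0.0):
    """Expected G-E odds ratio among cases.

    Assumes (hypothetically) binary G and E, independent in the
    population, and logit P(D | G, E) = b0 + bg*G + be*E + bge*G*E.
    The frequencies of G and E cancel, so only the four risks matter.
    """
    p00 = expit(b0)                  # risk with neither factor
    p10 = expit(b0 + bg)             # risk with G only
    p01 = expit(b0 + be)             # risk with E only
    p11 = expit(b0 + bg + be + bge)  # risk with both factors
    return (p11 * p00) / (p10 * p01)


def interaction_bias(b0, bg, be, bge=0.0):
    """Bias of the case-only log odds ratio relative to the true bge."""
    return math.log(case_only_or(b0, bg, be, bge)) - bge


if __name__ == "__main__":
    # Illustrative values only: no true interaction (bge = 0),
    # main-effect log odds ratios of 1.0, varying baseline risk.
    for b0 in (-7.0, -3.0, 0.0):
        print(f"baseline risk {expit(b0):.4f}: "
              f"bias with two main effects {interaction_bias(b0, 1.0, 1.0):+.4f}, "
              f"with one main effect {interaction_bias(b0, 1.0, 0.0):+.4f}")
```

In this sketch the bias is zero whenever only one main effect is present, whatever the baseline risk, and it moves away from zero as the baseline risk rises when both main effects are present. That is consistent with the pattern described in the Results, although the baseline risk expit(b0) is only a rough proxy for the population prevalence used in the paper.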
About the journal:
BMC Genomics is an open access, peer-reviewed journal that considers articles on all aspects of genome-scale analysis, functional genomics, and proteomics.
BMC Genomics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.