Yi-Juan Hu, Amand F Schmidt, Frank Dudbridge, Michael V Holmes, James M Brophy, Vinicius Tragante, Ziyi Li, Peizhou Liao, Arshed A Quyyumi, Raymond O McCubrey, Benjamin D Horne, Aroon D Hingorani, Folkert W Asselbergs, Riyaz S Patel, Qi Long
Impact of Selection Bias on Estimation of Subsequent Event Risk
Circulation: Cardiovascular Genetics. Published 2017-10-01. DOI: 10.1161/CIRCGENETICS.116.001616
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5659743/pdf/nihms902022.pdf
Citations: 0
Abstract
Background: Studies of recurrent or subsequent disease events may be susceptible to bias caused by selection of subjects who both experience and survive the primary indexing event. Currently, the magnitude of any selection bias, particularly for subsequent time-to-event analysis in genetic association studies, is unknown.
Methods and results: We used empirically inspired simulation studies to explore the impact of selection bias on the marginal hazard ratio for risk of subsequent events among those with established coronary heart disease. The extent of selection bias was determined by the magnitudes of genetic and nongenetic effects on the indexing (first) coronary heart disease event. Unless the genetic hazard ratio was unrealistically large (>1.6 per allele) and assuming the sum of all nongenetic hazard ratios was <10, bias was usually <10% (downward toward the null). Despite the low bias, the probability that a confidence interval included the true effect decreased (undercoverage) with increasing sample size because of increasing precision. Importantly, false-positive rates were not affected by selection bias.
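The selection mechanism described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' simulation design: it uses a single unmeasured nongenetic risk factor `u`, exponential times to the first event, and deliberately large effect sizes (`hr_g=1.6`, `hr_u=3.0`) so the effect is visible in one run. All function names and parameter values are hypothetical. It shows only that restricting to subjects who experience the indexing event induces a negative association between genotype and the nongenetic factor, which is one way a genetic effect estimate for subsequent events can be pulled toward the null.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=40000, maf=0.3, hr_g=1.6, hr_u=3.0,
             base_rate=0.05, follow_up=10.0, seed=1):
    """Simulate first-event times; return (everyone, selected) as (g, u) pairs.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    log_hr_g, log_hr_u = math.log(hr_g), math.log(hr_u)
    everyone, selected = [], []
    for _ in range(n):
        # Genotype: number of risk alleles (0, 1, or 2).
        g = (rng.random() < maf) + (rng.random() < maf)
        # Unmeasured nongenetic risk factor, independent of genotype.
        u = rng.gauss(0.0, 1.0)
        # Exponential time to the indexing (first) event.
        rate = base_rate * math.exp(log_hr_g * g + log_hr_u * u)
        t_first = rng.expovariate(rate)
        everyone.append((g, u))
        # Only subjects with a first event during follow-up enter the
        # subsequent-event study.
        if t_first <= follow_up:
            selected.append((g, u))
    return everyone, selected


everyone, selected = simulate()
r_all = pearson(*zip(*everyone))
r_sel = pearson(*zip(*selected))
print(f"selected {len(selected)} of {len(everyone)} subjects")
print(f"corr(g, u) in full population: {r_all:.3f}")
print(f"corr(g, u) among selected:     {r_sel:.3f}")
```

In the full population the correlation should be close to zero by construction, while among selected subjects it should be negative: carriers of risk alleles need less nongenetic risk to reach a first event. With smaller, more realistic effect sizes the induced correlation shrinks, which is consistent with the limited bias reported above.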
Conclusions: In most empirical settings, selection bias is expected to have a limited impact on genetic effect estimates of subsequent event risk. Nevertheless, because undercoverage increases with sample size, most confidence intervals will be overly precise (too narrow). When there is no effect modification by history of coronary heart disease, the false-positive rates of association tests will be close to nominal.
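The link between a small fixed bias and undercoverage that grows with sample size can be made concrete with a short calculation. The sketch below assumes a normally distributed estimator with a constant bias and a standard error of `sd_per_obs / sqrt(n)`; the function names and the numeric inputs (a bias of -0.01 on the log hazard ratio scale, `sd_per_obs=2.0`) are hypothetical and are not taken from the paper.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ci_coverage(bias, sd_per_obs, n, z=1.959964):
    """Probability that estimate +/- z*SE contains the true value.

    Assumes estimate ~ Normal(truth + bias, SE^2) with SE = sd_per_obs / sqrt(n).
    """
    se = sd_per_obs / math.sqrt(n)
    shift = bias / se  # bias measured in standard-error units
    return normal_cdf(z - shift) - normal_cdf(-z - shift)


# The bias stays fixed while the standard error shrinks with n,
# so the interval increasingly misses the true value.
for n in (1_000, 10_000, 100_000):
    print(n, round(ci_coverage(bias=-0.01, sd_per_obs=2.0, n=n), 3))
```

Under these assumptions, coverage is at the nominal 95% only when the bias is zero, and it falls as `n` grows because the bias becomes large relative to the standard error.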
About the journal:
Circulation: Genomic and Precision Medicine considers all types of original research articles, including studies conducted in human subjects, laboratory animals, in vitro, and in silico. Articles may include investigations of: clinical genetics as applied to the diagnosis and management of monogenic or oligogenic cardiovascular disorders; the molecular basis of complex cardiovascular disorders, including genome-wide association studies, exome and genome sequencing-based association studies, coding variant association studies, genetic linkage studies, epigenomics, transcriptomics, proteomics, metabolomics, and metagenomics; integration of electronic health record data or patient-generated data with any of the aforementioned approaches, including phenome-wide association studies, or with environmental or lifestyle factors; pharmacogenomics; regulation of gene expression; gene therapy and therapeutic genomic editing; systems biology approaches to the diagnosis and management of cardiovascular disorders; novel methods to perform any of the aforementioned studies; and novel applications of precision medicine. Above all, we seek studies with relevance to human cardiovascular biology and disease.