Boyang Fu, Prateek Anand, Aakarsh Anand, Joel Mefford, Sriram Sankararaman
Genome Research. Published 2024-08-29. DOI: 10.1101/gr.279140.124 (https://doi.org/10.1101/gr.279140.124)
A scalable adaptive quadratic kernel method for interpretable epistasis analysis in complex traits
Our knowledge of the contribution of genetic interactions (epistasis) to variation in human complex traits remains limited, partly due to the lack of efficient, powerful, and interpretable algorithms to detect interactions. Recently proposed approaches for set-based association tests show promise in improving power to detect epistasis by examining the aggregated effects of multiple variants. Nevertheless, these methods either do not scale to large biobank datasets or lack interpretability. We propose QuadKAST, a scalable algorithm that tests pairwise interaction effects (quadratic effects) on a trait within small- to medium-sized sets of genetic variants (≤ 100 SNPs) and provides a quantified interpretation of these effects. Comprehensive simulations showed that QuadKAST is well calibrated. Additionally, QuadKAST is highly sensitive in detecting loci with epistatic signals and accurate in its estimation of quadratic effects. We applied QuadKAST to 52 quantitative phenotypes measured in ~300,000 unrelated white British individuals in the UK Biobank to test for quadratic effects within each of 9,515 protein-coding genes. We detected 32 trait-gene pairs across 17 traits and 29 genes that demonstrate statistically significant signals of quadratic effects (p ≤ 0.05/(9,515 × 52), accounting for the number of genes and traits tested). Across these trait-gene pairs, the proportion of trait variance explained by quadratic effects is similar to that explained by additive effects (median $\sigma^{2}_{\mathrm{quad}} / \sigma^{2}_{g}$ = 0.15), with five pairs having a ratio greater than one. Our method enables the detailed investigation of epistasis on a large scale, offering new insights into its role and importance.
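As a rough illustration of the arithmetic in the abstract, the sketch below computes the Bonferroni-style significance threshold 0.05/(9,515 × 52) and the number of distinct SNP pairs in a set at the stated upper bound of 100 SNPs. The function names are hypothetical and are not taken from the QuadKAST software. Counting unordered pairs is only an assumption about how pairwise interactions might be enumerated; the abstract does not say whether squared terms are also included.

```python
from math import comb


def bonferroni_threshold(alpha: float, n_genes: int, n_traits: int) -> float:
    """Per-test p-value threshold after correcting for every gene-trait test."""
    return alpha / (n_genes * n_traits)


def n_pairwise_terms(n_snps: int) -> int:
    """Number of unordered SNP pairs (i < j) in a set of n_snps variants."""
    return comb(n_snps, 2)


# Figures quoted in the abstract: 9,515 genes, 52 traits, alpha = 0.05.
n_tests = 9515 * 52
threshold = bonferroni_threshold(0.05, 9515, 52)

print(n_tests)                # 494780
print(f"{threshold:.2e}")     # 1.01e-07
print(n_pairwise_terms(100))  # 4950
```

Under these assumptions, a gene-trait pair would need a p-value of roughly 1e-7 or smaller to be reported, and a 100-SNP set would contribute up to 4,950 pairwise terms.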
About the journal:
Launched in 1995, Genome Research is an international, continuously published, peer-reviewed journal that focuses on research that provides novel insights into the genome biology of all organisms, including advances in genomic medicine.
Among the topics considered by the journal are genome structure and function, comparative genomics, molecular evolution, genome-scale quantitative and population genetics, proteomics, epigenomics, and systems biology. The journal also features exciting gene discoveries and reports of cutting-edge computational biology and high-throughput methodologies.
New data in these areas are published as research papers, or methods and resource reports that provide novel information on technologies or tools that will be of interest to a broad readership. Complete data sets are presented electronically on the journal's web site where appropriate. The journal also provides Reviews, Perspectives, and Insight/Outlook articles, which present commentary on the latest advances published both here and elsewhere, placing such progress in its broader biological context.