Title: Estimation of causal effects of genes on complex traits using a Bayesian-network-based framework applied to GWAS data
Authors: Liangying Yin, Yaning Feng, Yujia Shi, Alexandria Lau, Jinghong Qiu, Pak-Chung Sham, Hon-Cheong So
Journal: Nature Machine Intelligence, volume 6, issue 10, pages 1231-1244 (Journal Article)
Published: 2024-10-17
DOI: 10.1038/s42256-024-00906-7
URL: https://www.nature.com/articles/s42256-024-00906-7
Journal metrics: impact factor 18.8; JCR Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Deciphering the relationships between genes and complex traits can enhance our understanding of phenotypic variations and disease mechanisms. However, determining the specific roles of individual genes and quantifying their direct and indirect causal effects on complex traits remain significant challenges. Here we present a framework, Bayesian network genome-wide association studies (BN-GWAS), to decipher the total and direct causal effects of individual genes. BN-GWAS leverages imputed expression profiles from GWAS and raw expression data from a reference dataset to construct a directed gene–gene–phenotype causal network. It allows gene expression and disease traits to be evaluated in different samples, significantly improving the flexibility and applicability of the approach. It can be extended to decipher the joint causal network of two or more traits, and exhibits high specificity and precision (positive predictive value), making it particularly useful for selecting genes for follow-up studies. We verified the feasibility and validity of BN-GWAS by extensive simulations and applications to 52 traits across 14 tissues in the UK Biobank, revealing insights into their genetic architectures, including the relative contributions of direct, indirect and mediating causal genes. The identified (direct) causal genes were significantly enriched for genes highlighted in the Open Targets database. Overall, BN-GWAS provides a flexible and powerful framework for elucidating the genetic basis of complex traits through a systems-level, causal inference approach.
Genome-wide association studies generate extensive data, but interpreting these data remains challenging. A Bayesian-network-based method is presented that uses imputed and raw gene expression data to decipher the causal effects of individual genes.
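The distinction the abstract draws between total and direct causal effects can be illustrated with a toy example. The sketch below is not the BN-GWAS method: it assumes a hypothetical three-node linear model (gene 1 affects gene 2, and both affect the trait) with made-up coefficients `a`, `b` and `c`, and recovers the effects by ordinary least squares. In such a linear model the total effect of gene 1 equals `c + a * b`, of which `c` is direct and `a * b` is mediated through gene 2.

```python
import numpy as np


def simulate_effects(n=20000, a=0.8, b=0.5, c=0.3, seed=0):
    """Toy linear model (hypothetical, for illustration only).

    gene1 -> gene2 (coefficient a)
    gene2 -> trait (coefficient b)
    gene1 -> trait (coefficient c, the direct effect)
    """
    rng = np.random.default_rng(seed)
    g1 = rng.normal(size=n)
    g2 = a * g1 + rng.normal(size=n)
    trait = c * g1 + b * g2 + rng.normal(size=n)

    # Total effect of gene1: regress the trait on gene1 alone.
    x_total = np.column_stack([np.ones(n), g1])
    total = np.linalg.lstsq(x_total, trait, rcond=None)[0][1]

    # Direct effect of gene1: adjust for the mediator gene2.
    x_direct = np.column_stack([np.ones(n), g1, g2])
    direct = np.linalg.lstsq(x_direct, trait, rcond=None)[0][1]

    # The remainder is the effect transmitted through gene2.
    return {"total": total, "direct": direct, "indirect": total - direct}


if __name__ == "__main__":
    for name, value in simulate_effects().items():
        print(f"{name}: {value:.3f}")
```

With the default coefficients, the estimates should fall close to the theoretical values of 0.7 (total), 0.3 (direct) and 0.4 (indirect). Real gene networks are far larger and their structure is unknown, which is why a network-learning step such as the one the abstract describes is needed before effects like these can be read off.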
About the journal:
Nature Machine Intelligence is a distinguished publication that presents original research and reviews on various topics in machine learning, robotics, and AI. Our focus extends beyond these fields, exploring their profound impact on other scientific disciplines, as well as societal and industrial aspects. We recognize limitless possibilities wherein machine intelligence can augment human capabilities and knowledge in domains like scientific exploration, healthcare, medical diagnostics, and the creation of safe and sustainable cities, transportation, and agriculture. Simultaneously, we acknowledge the emergence of ethical, social, and legal concerns due to the rapid pace of advancements.
To foster interdisciplinary discussions on these far-reaching implications, Nature Machine Intelligence serves as a platform for dialogue facilitated through Comments, News Features, News & Views articles, and Correspondence. Our goal is to encourage a comprehensive examination of these subjects.
Like all Nature-branded journals, Nature Machine Intelligence operates under the guidance of a team of skilled editors. We adhere to a fair and rigorous peer-review process and maintain high standards of copy-editing and production, swift publication, and editorial independence.