Prediction of causal genes at GWAS loci with pleiotropic gene regulatory effects using sets of correlated instrumental variables.

ArXiv Pub Date : 2024-09-20
Mariyam Khan, Adriaan-Alexander Ludl, Sean Bankier, Johan LM Björkegren, Tom Michoel
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10802687/pdf/

Abstract

Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants are associated with multiple nearby genes, MVMR can potentially be used to predict candidate causal genes. However, consensus in the field dictates that the genetic instruments in MVMR must be independent (not in linkage disequilibrium), which is usually not possible when considering a group of candidate genes from the same locus. Here we used causal inference theory to show that MVMR with correlated instruments satisfies the instrumental set condition. This is a classical result by Brito and Pearl (2002) for structural equation models that guarantees the identifiability of individual causal effects in situations where multiple exposures collectively, but not individually, separate a set of instrumental variables from an outcome variable. Extensive simulations confirmed the validity and usefulness of these theoretical results. Importantly, the causal effect estimates remained unbiased and their variance small even when instruments were highly correlated, while bias introduced by horizontal pleiotropy or LD matrix sampling error was comparable to standard MR. We applied MVMR with correlated instrumental variable sets at genome-wide significant loci for coronary artery disease (CAD) risk using expression quantitative trait loci (eQTL) data from seven vascular and metabolic tissues in the STARNET study. Our method predicts causal genes at twelve loci, each associated with multiple colocated genes in multiple tissues. We confirm causal roles for PHACTR1 and ADAMTS7 in arterial tissues, among others. However, the extensive degree of regulatory pleiotropy across tissues and the limited number of causal variants in each locus still require that MVMR be run on a tissue-by-tissue basis, and testing all gene-tissue pairs with cis-eQTL associations at a given locus in a single model to predict causal gene-tissue combinations remains infeasible. Our results show that within tissues, MVMR with dependent, as opposed to independent, sets of instrumental variables significantly expands the scope for predicting causal genes in disease risk loci with pleiotropic regulatory effects. However, considering risk loci with regulatory pleiotropy that also spans across tissues remains an unsolved problem.
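
To make the idea of MVMR with correlated instruments concrete, the following is a minimal sketch of a summary-statistics estimator that accounts for linkage disequilibrium through generalized least squares. It is not taken from the paper and may differ from the authors' implementation; it assumes the commonly used weighting in which the covariance of the instrument-outcome associations is the LD matrix scaled by their standard errors. The function name `mvmr_correlated`, the toy dimensions, and the AR(1)-style LD matrix are all hypothetical choices for illustration.

```python
import numpy as np

def mvmr_correlated(B, gamma, se_gamma, R):
    """GLS estimate of the direct effects of k exposures on one outcome.

    B        : (m, k) instrument-exposure associations (e.g. eQTL effects)
    gamma    : (m,)   instrument-outcome associations (e.g. GWAS effects)
    se_gamma : (m,)   standard errors of gamma
    R        : (m, m) instrument correlation (LD) matrix
    """
    # Assumed covariance of gamma under LD: Omega_ij = se_i * se_j * R_ij
    omega = np.outer(se_gamma, se_gamma) * R
    omega_inv_B = np.linalg.solve(omega, B)
    omega_inv_g = np.linalg.solve(omega, gamma)
    # GLS: theta = (B' Omega^-1 B)^-1 B' Omega^-1 gamma
    cov = np.linalg.inv(B.T @ omega_inv_B)
    theta = cov @ (B.T @ omega_inv_g)
    return theta, np.sqrt(np.diag(cov))

# Toy locus (hypothetical numbers): 4 instruments in strong LD, 2 candidate genes.
rng = np.random.default_rng(0)
m, k = 4, 2
idx = np.arange(m)
R = 0.8 ** np.abs(idx[:, None] - idx[None, :])  # AR(1)-style LD matrix
B = rng.normal(size=(m, k))                     # instrument-gene associations
theta_true = np.array([0.5, 0.0])               # only gene 1 is causal
gamma = B @ theta_true                          # noise-free outcome associations
se_gamma = np.full(m, 0.05)

theta_hat, se_hat = mvmr_correlated(B, gamma, se_gamma, R)
print(theta_hat, se_hat)
```

Because the outcome associations in this toy example are generated without noise, the estimator should recover `theta_true` up to floating-point error regardless of how strongly the instruments are correlated. With real data, `R` would have to be estimated from a reference panel, which is where the LD matrix sampling error mentioned in the abstract enters.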
