A Fast and Accurate Method for Genome-wide Scale Phenome-wide G × E Analysis and Its Application to UK Biobank.

IF 8.1 1区 生物学 Q1 GENETICS & HEREDITY American journal of human genetics Pub Date : 2019-12-05 Epub Date: 2019-11-14 DOI:10.1016/j.ajhg.2019.10.008
Wenjian Bi, Zhangchen Zhao, Rounak Dey, Lars G Fritsche, Bhramar Mukherjee, Seunggeun Lee
{"title":"A Fast and Accurate Method for Genome-wide Scale Phenome-wide G × E Analysis and Its Application to UK Biobank.","authors":"Wenjian Bi, Zhangchen Zhao, Rounak Dey, Lars G Fritsche, Bhramar Mukherjee, Seunggeun Lee","doi":"10.1016/j.ajhg.2019.10.008","DOIUrl":null,"url":null,"abstract":"<p><p>The etiology of most complex diseases involves genetic variants, environmental factors, and gene-environment interaction (G × E) effects. Compared with marginal genetic association studies, G × E analysis requires more samples and detailed measure of environmental exposures, and this limits the possible discoveries. Large-scale population-based biobanks with detailed phenotypic and environmental information, such as UK-Biobank, can be ideal resources for identifying G × E effects. However, due to the large computation cost and the presence of case-control imbalance, existing methods often fail. Here we propose a scalable and accurate method, SPAGE (SaddlePoint Approximation implementation of G × E analysis), that is applicable for genome-wide scale phenome-wide G × E studies. SPAGE fits a genotype-independent logistic model only once across the genome-wide analysis in order to reduce computation cost, and SPAGE uses a saddlepoint approximation (SPA) to calibrate the test statistics for analysis of phenotypes with unbalanced case-control ratios. Simulation studies show that SPAGE is 33-79 times faster than the Wald test and 72-439 times faster than the Firth's test, and SPAGE can control type I error rates at the genome-wide significance level even when case-control ratios are extremely unbalanced. Through the analysis of UK-Biobank data of 344,341 white British European-ancestry samples, we show that SPAGE can efficiently analyze large samples while controlling for unbalanced case-control ratios.</p>","PeriodicalId":7659,"journal":{"name":"American journal of human genetics","volume":" ","pages":"1182-1192"},"PeriodicalIF":8.1000,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6904814/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"American journal of human genetics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1016/j.ajhg.2019.10.008","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2019/11/14 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
引用次数: 0

Abstract

The etiology of most complex diseases involves genetic variants, environmental factors, and gene-environment interaction (G × E) effects. Compared with marginal genetic association studies, G × E analysis requires more samples and detailed measure of environmental exposures, and this limits the possible discoveries. Large-scale population-based biobanks with detailed phenotypic and environmental information, such as UK-Biobank, can be ideal resources for identifying G × E effects. However, due to the large computation cost and the presence of case-control imbalance, existing methods often fail. Here we propose a scalable and accurate method, SPAGE (SaddlePoint Approximation implementation of G × E analysis), that is applicable for genome-wide scale phenome-wide G × E studies. SPAGE fits a genotype-independent logistic model only once across the genome-wide analysis in order to reduce computation cost, and SPAGE uses a saddlepoint approximation (SPA) to calibrate the test statistics for analysis of phenotypes with unbalanced case-control ratios. Simulation studies show that SPAGE is 33-79 times faster than the Wald test and 72-439 times faster than the Firth's test, and SPAGE can control type I error rates at the genome-wide significance level even when case-control ratios are extremely unbalanced. Through the analysis of UK-Biobank data of 344,341 white British European-ancestry samples, we show that SPAGE can efficiently analyze large samples while controlling for unbalanced case-control ratios.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
一种快速准确的全基因组全表型G × E分析方法及其在英国生物样本库中的应用。
大多数复杂疾病的病因涉及遗传变异、环境因素和基因-环境相互作用(G × E)效应。与边际遗传关联研究相比,gxe分析需要更多的样本和详细的环境暴露测量,这限制了可能的发现。具有详细表型和环境信息的大规模基于人群的生物库,如UK-Biobank,可以成为鉴定G × E效应的理想资源。然而,由于计算成本大,且存在病例控制不平衡,现有方法往往失败。在这里,我们提出了一种可扩展和精确的方法,SPAGE (SaddlePoint Approximation implementation of G × E analysis),它适用于全基因组规模的全表型G × E研究。为了降低计算成本,SPAGE在全基因组分析中只拟合一次与基因型无关的逻辑模型,并且SPAGE使用鞍点近似(SPA)来校准病例对照比不平衡的表型分析的检验统计量。模拟研究表明,SPAGE比Wald检验快33-79倍,比Firth检验快72-439倍,即使在病例-对照比极度不平衡的情况下,SPAGE也能在全基因组显著性水平上控制I型错误率。通过对UK-Biobank中344,341份英国白人欧洲血统样本的分析,我们发现SPAGE可以有效地分析大样本,同时控制不平衡的病例-对照比率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
14.70
自引率
4.10%
发文量
185
审稿时长
1 months
期刊介绍: The American Journal of Human Genetics (AJHG) is a monthly journal published by Cell Press, chosen by The American Society of Human Genetics (ASHG) as its premier publication starting from January 2008. AJHG represents Cell Press's first society-owned journal, and both ASHG and Cell Press anticipate significant synergies between AJHG content and that of other Cell Press titles.
期刊最新文献
Variant interpretation training for the genomics era: Learning outcomes to inform professional competencies and education. Bi-allelic loss-of-function variants in JKAMP cause a neurodevelopmental syndrome associated with dysregulation of GPR37 trafficking. SMN1 variants identified by false-positive SMA newborn screening tests: Therapeutic hurdles and functional and epidemiological solutions. Rare-variant aggregation highlights disease-linked genes associated with brain volume variation. Multiple-testing corrections in case-control studies using identity-by-descent segments.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1