Yield of genetic association signals from genomes, exomes and imputation in the UK Biobank

Nature Genetics · IF 31.7 · Zone 1 (Biology) · JCR Q1 (Genetics & Heredity) · Pub Date: 2024-09-25 · DOI: 10.1038/s41588-024-01930-4
Sheila M. Gaynor, Tyler Joseph, Xiaodong Bai, Yuxin Zou, Boris Boutkov, Evan K. Maxwell, Olivier Delaneau, Robin J. Hofmeister, Olga Krasheninina, Suganthi Balasubramanian, Anthony Marcketta, Joshua Backman, Regeneron Genetics Center, Jeffrey G. Reid, John D. Overton, Luca A. Lotta, Jonathan Marchini, William J. Salerno, Aris Baras, Goncalo R. Abecasis, Timothy A. Thornton

Abstract

Whole-genome sequencing (WGS), whole-exome sequencing (WES) and array genotyping with imputation (IMP) are common strategies for assessing genetic variation and its association with medically relevant phenotypes. To date, there has been no systematic empirical assessment of the yield of these approaches when applied to hundreds of thousands of samples to enable the discovery of complex trait genetic signals. Using data for 100 complex traits from 149,195 individuals in the UK Biobank, we systematically compare the relative yield of these strategies in genetic association studies. We find that WGS and WES combined with arrays and imputation (WES + IMP) have the largest association yield. Although WGS results in an approximately fivefold increase in the total number of assayed variants over WES + IMP, the number of detected signals differed by only 1% for both single-variant and gene-based association analyses. Given that WES + IMP typically results in savings of lab and computational time and resources expended per sample, we evaluate the potential benefits of applying WES + IMP to larger samples. When we extend our WES + IMP analyses to 468,169 UK Biobank individuals, we observe an approximately fourfold increase in association signals with the threefold increase in sample size. We conclude that prioritizing WES + IMP and large sample sizes rather than contemporary short-read WGS alternatives will maximize the number of discoveries in genetic association studies.

Comparison of association signals in UK Biobank using different strategies for assessing genetic variation shows that whole-exome sequencing combined with array genotyping and imputation offers similar performance to whole-genome sequencing at a reduced cost.
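The scaling figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the sample sizes are taken from the abstract, while the "fourfold" and "fivefold" values are the abstract's rounded approximations, and the variable names and the per-individual ratio are assumptions introduced here, not quantities reported by the authors.

```python
# Back-of-the-envelope check of the scaling figures quoted in the abstract.
# All names below are hypothetical; only the two sample sizes are exact.

N_COMPARISON = 149_195  # individuals in the WGS vs. WES + IMP comparison
N_EXTENDED = 468_169    # individuals in the extended WES + IMP analysis

SIGNAL_FOLD = 4.0       # "approximately fourfold" more signals (rounded)
VARIANT_FOLD = 5.0      # "approximately fivefold" more variants with WGS (rounded)
SIGNAL_DIFF = 0.01      # "only 1%" difference in detected signals


def fold_change(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


# The "threefold" increase in sample size, computed from the exact counts.
sample_fold = fold_change(N_EXTENDED, N_COMPARISON)

# If signals grew ~4x while samples grew ~3.1x, signals rose faster than
# sample size. This ratio is an assumption-laden summary, not a reported result.
signals_per_individual_fold = SIGNAL_FOLD / sample_fold

print(f"Sample-size fold increase:         {sample_fold:.2f}x")
print(f"Signal fold per sample-size fold:  {signals_per_individual_fold:.2f}")
print(f"WGS: ~{VARIANT_FOLD:.0f}x the variants for a ~{SIGNAL_DIFF:.0%} "
      f"difference in detected signals")
```

Because the fourfold figure is rounded, the derived ratio should be read as a rough indication that signal yield grew faster than sample size, not as a precise estimate.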
