Inflation of polygenic risk scores caused by sample overlap and relatedness: Examples of a major risk of bias.

American Journal of Human Genetics | IF 8.1 | Zone 1 (Biology) | JCR Q1 (Genetics & Heredity) | Pub Date: 2024-09-05 | Epub Date: 2024-08-20 | DOI: 10.1016/j.ajhg.2024.07.014
Colin A Ellis, Karen L Oliver, Rebekah V Harris, Ruth Ottman, Ingrid E Scheffer, Heather C Mefford, Michael P Epstein, Samuel F Berkovic, Melanie Bahlo
Pages: 1805-1809 | Publication type: Journal Article | PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393675/pdf/
Citations: 0

Abstract

Polygenic risk scores (PRSs) are an important tool for understanding the role of common genetic variants in human disease. Standard best practices recommend that PRSs be analyzed in cohorts that are independent of the genome-wide association study (GWAS) used to derive the scores without sample overlap or relatedness between the two cohorts. However, identifying sample overlap and relatedness can be challenging in an era of GWASs performed by large biobanks and international research consortia. Although most genomics researchers are aware of best practices and theoretical concerns about sample overlap and relatedness between GWAS and PRS cohorts, the prevailing assumption is that the risk of bias is small for very large GWASs. Here, we present two real-world examples demonstrating that sample overlap and relatedness is not a minor or theoretical concern but an important potential source of bias in PRS studies. Using a recently developed statistical adjustment tool, we found that excluding overlapping and related samples was equal to or more powerful than adjusting for overlap bias. Our goal is to make genomics researchers aware of the magnitude of risk of bias from sample overlap and relatedness and to highlight the need for mitigation tools, including independent validation cohorts in PRS studies, continued development of statistical adjustment methods, and tools for researchers to test their cohorts for overlap and relatedness with GWAS cohorts without sharing individual-level data.
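
The exclusion step the abstract refers to can be illustrated with a minimal sketch. This is not the authors' pipeline or the statistical adjustment tool they mention. It assumes, hypothetically, that sample identifiers are shared between the two cohorts and that pairwise kinship coefficients between target and GWAS samples have already been estimated by some external tool. The function name `independent_targets` and the default cutoff of 0.0884 (the commonly used lower bound for second-degree relatives on the KING kinship scale) are illustrative choices, not values taken from the paper.

```python
from typing import Iterable, List, Mapping, Tuple


def independent_targets(
    target_ids: Iterable[str],
    gwas_ids: Iterable[str],
    kinship: Mapping[Tuple[str, str], float],
    max_kinship: float = 0.0884,  # assumed cutoff, not from the paper
) -> List[str]:
    """Return target samples that are neither in the GWAS nor related to it.

    `kinship` maps (target_id, gwas_id) pairs to a precomputed kinship
    coefficient; pairs that are absent are treated as unrelated.
    """
    gwas = set(gwas_ids)
    # Target samples with at least one GWAS relative at or above the cutoff.
    related = {
        target
        for (target, gwas_sample), phi in kinship.items()
        if gwas_sample in gwas and phi >= max_kinship
    }
    # Preserve the input order of the target cohort.
    return [t for t in target_ids if t not in gwas and t not in related]


# Hypothetical toy data: T2 overlaps the GWAS, T3 has a first-degree relative in it.
targets = ["T1", "T2", "T3", "T4"]
gwas = ["G1", "G2", "T2"]
kin = {("T3", "G1"): 0.25, ("T4", "G2"): 0.01}
kept = independent_targets(targets, gwas, kin)
```

In practice, overlapping individuals rarely share identifiers across studies and would instead show up as duplicate-level kinship estimates, which is one reason the abstract calls for tools that can test for overlap without sharing individual-level data.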

Source journal: American Journal of Human Genetics
CiteScore: 14.70
Self-citation rate: 4.10%
Articles published: 185
Review time: 1 month
About the journal: The American Journal of Human Genetics (AJHG) is a monthly journal published by Cell Press, chosen by The American Society of Human Genetics (ASHG) as its premier publication starting from January 2008. AJHG represents Cell Press's first society-owned journal, and both ASHG and Cell Press anticipate significant synergies between AJHG content and that of other Cell Press titles.
Latest articles from this journal
The PRIMED Consortium: Reducing disparities in polygenic risk assessment.
Comparative analysis of predicted DNA secondary structures infers complex human centromere topology.
Toward trustable use of machine learning models of variant effects in the clinic.
Allele frequency impacts the cross-ancestry portability of gene expression prediction in lymphoblastoid cell lines.
Inherited infertility: Mapping loci associated with impaired female reproduction.