Using Genomic Context Informed Genotype Data and Within‐model Ancestry Adjustment to Classify Type 2 Diabetes

Eric J Barnett, Yanli Zhang-James, Jonathan Hess, Stephen J Glatt, Stephen V Faraone
medRxiv - Genetic and Genomic Medicine, published 2024-09-13
DOI: 10.1101/2024.09.12.24313579 (https://doi.org/10.1101/2024.09.12.24313579)

Abstract

Despite high heritability estimates, complex genetic disorders have proven difficult to predict with genetic data. Genomic research has documented polygenic inheritance, cross-disorder genetic correlations, and enrichment of risk by functional genomic annotation, but the vast potential of that combined knowledge has not yet been leveraged to build optimal risk models. Additional methods are likely required to advance genetic risk models of complex genetic disorders towards clinical utility. We developed a framework that uses annotations providing genomic context alongside genotype data as input to convolutional neural networks to predict disorder risk. We validated models in a matched-pairs type 2 diabetes dataset. A neural network using genotype data (AUC: 0.66) and a convolutional neural network using context-informed genotype data (AUC: 0.65) both significantly outperformed polygenic risk score approaches in classifying type 2 diabetes. Adversarial ancestry tasks eliminated the predictability of ancestry without changing model performance.
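
The AUC values reported above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `auc` and the example scores are hypothetical and chosen only for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    scores: predicted risk per individual (higher means more likely a case).
    labels: 1 for cases, 0 for controls.
    A tie between a case and a control counts as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    # Fraction of case/control pairs that the scores order correctly.
    return wins / (len(cases) * len(controls))


# Made-up risk scores for three cases and three controls (illustration only).
example_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(example_scores, example_labels)
```

Under this reading, an AUC of 0.5 corresponds to chance-level ordering and 1.0 to perfect separation, so values around 0.65 indicate modest but real discrimination. The pairwise loop is quadratic and is meant for clarity on small inputs; a sort-based rank computation would be the usual choice at scale.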