Deep learning insights into distinct patterns of polygenic adaptation across human populations.

IF 16.6 | Zone 2, Biology | Q1 Biochemistry & Molecular Biology | Nucleic Acids Research | Pub Date: 2024-11-21 | DOI: 10.1093/nar/gkae1027
Devashish Tripathi, Chandrika Bhattacharyya, Analabha Basu
Citations: 0

Abstract

Response to spatiotemporal variation in selection gradients has resulted in signatures of polygenic adaptation in human genomes. We introduce RAISING, a two-stage deep learning framework that optimizes neural network architecture through hyperparameter tuning before performing feature selection and prediction tasks. We tested RAISING on published and newly designed simulations that incorporate the complex interplay between demographic history and selection gradients. RAISING outperformed Phylogenetic Generalized Least Squares (PGLS), ridge regression and DeepGenomeScan, with significantly higher true positive rates (TPR) in detecting genetic adaptation. On published data, it reduced computational time 60-fold and increased TPR by up to 28% compared to DeepGenomeScan. In more complex demographic simulations, RAISING showed fewer false discoveries and significantly higher TPR, up to 17-fold, compared to other methods. RAISING was robust, showing the least sensitivity to demographic history, selection gradient and their interactions. We developed a sliding window method for genome-wide implementation of RAISING to overcome the computational challenges of high-dimensional genomic data. Applying it to African, European, South Asian and East Asian populations, we identified multiple genomic regions undergoing polygenic selection. Notably, ∼70% of the regions identified in Africans are unique, with broad patterns distinguishing them from non-Africans, corroborating the Out of Africa dispersal model.
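
The abstract mentions a sliding window method and reports results as true positive rates and false discoveries, but gives no implementation details. The following is only a minimal sketch of those two ideas under stated assumptions: the function names, the window and step sizes, and the use of SNP indices are all hypothetical and are not taken from RAISING.

```python
from typing import Iterator, Set, Tuple


def sliding_windows(n_snps: int, window: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) index ranges that cover SNPs 0..n_snps-1.

    Window and step sizes are illustrative assumptions, not values from the paper.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    start = 0
    while start < n_snps:
        end = min(start + window, n_snps)
        yield start, end
        if end == n_snps:
            break  # the last window already reaches the end of the genome
        start += step


def tpr_and_fdr(detected: Set[int], causal: Set[int]) -> Tuple[float, float]:
    """Return (true positive rate, false discovery rate) for a simulated data set.

    `detected` holds the SNP indices a method flags; `causal` holds the
    simulated adaptive SNP indices.
    """
    true_pos = len(detected & causal)
    tpr = true_pos / len(causal) if causal else 0.0
    fdr = (len(detected) - true_pos) / len(detected) if detected else 0.0
    return tpr, fdr


# Hypothetical usage: 10 SNPs, windows of 4 SNPs, shifted by 2 SNPs each time.
windows = list(sliding_windows(n_snps=10, window=4, step=2))
tpr, fdr = tpr_and_fdr(detected={1, 2, 3, 9}, causal={1, 2, 5, 6})
```

In a genome-wide setting, each window would presumably be passed to the model separately so that no single fit has to handle every SNP at once; how the per-window results are combined is not described in the abstract and is left out here.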

Source journal
Nucleic Acids Research (Biology: Biochemistry & Molecular Biology)
CiteScore: 27.10
Self-citation rate: 4.70%
Articles published: 1057
Review time: 2 months
Journal description: Nucleic Acids Research (NAR) is a scientific journal that publishes research on various aspects of nucleic acids and proteins involved in nucleic acid metabolism and interactions. It covers areas such as chemistry and synthetic biology, computational biology, gene regulation, chromatin and epigenetics, genome integrity, repair and replication, genomics, molecular biology, nucleic acid enzymes, RNA, and structural biology. The journal also includes a Survey and Summary section for brief reviews. Additionally, each year, the first issue is dedicated to biological databases, and an issue in July focuses on web-based software resources for the biological community. Nucleic Acids Research is indexed by several services including Abstracts on Hygiene and Communicable Diseases, Animal Breeding Abstracts, Agricultural Engineering Abstracts, Agbiotech News and Information, BIOSIS Previews, CAB Abstracts, and EMBASE.
Latest articles from this journal
Deep learning insights into distinct patterns of polygenic adaptation across human populations.
Single-stranded DNA with internal base modifications mediates highly efficient knock-in in primary cells using CRISPR-Cas9
CATH v4.4: major expansion of CATH by experimental and predicted structural data
L1-ORF1p nucleoprotein can rapidly assume distinct conformations and simultaneously bind more than one nucleic acid
Harmonizome 3.0: integrated knowledge about genes and proteins from diverse multi-omics resources