Thibaut Sellinger, Frank Johannes, Aurélien Tellier
Journal: eLife
Publication date: 2024-09-12
DOI: https://doi.org/10.7554/elife.89470.4
Abstract
Improved inference of population histories by integrating genomic and epigenomic data
With the availability of high-quality full-genome polymorphism (SNP) data, it has become feasible to study the past demographic and selective history of populations in exquisite detail. However, such inferences still lack statistical resolution for recent events, such as bottlenecks, and/or for populations with low nucleotide diversity. Additional heritable (epi)genetic markers, such as indels, transposable elements, microsatellites, or cytosine methylation, may provide further, as yet untapped, information on recent population history. We extend the Sequential Markovian Coalescent (SMC) framework to jointly use SNPs and other hyper-mutable markers. This allows us to (1) improve the accuracy of demographic inference in recent times, (2) uncover past demographic events hidden from SNP-based inference methods, and (3) infer the mutation rates of hyper-mutable markers under a finite-site model. As a proof of principle, we focus on demographic inference in Arabidopsis thaliana using DNA methylation diversity data from 10 European natural accessions. We demonstrate that segregating single methylated polymorphisms (SMPs) satisfy the modeling assumptions of the SMC framework, whereas differentially methylated regions (DMRs) are not suitable because their length exceeds the genomic distance between two recombination events. Combining SNPs and SMPs while accounting for site- and region-level epimutation processes, we provide new estimates of the glacial-age bottleneck and post-glacial population expansion of the European A. thaliana population. Our SMC framework readily accounts for a wide range of heritable genomic markers, thus paving the way for next-generation inference of evolutionary history by combining information from several genetic and epigenetic markers.
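To illustrate why a finite-site model matters for hyper-mutable markers, the following minimal sketch compares the probability that two sampled sequences differ at a site under an infinite-sites approximation with the same probability under a generic reversible two-state (e.g. unmethylated/methylated) model. It is not the authors' implementation. The function names, the assumption that the ancestral state is drawn from the stationary distribution, and the rate values are all illustrative assumptions, not estimates from the paper.

```python
import math


def p_differ_infinite_sites(mu: float, t: float) -> float:
    """Probability of at least one mutation on the two branches of length t
    (generations) joining two samples, with per-lineage rate mu per generation."""
    return 1.0 - math.exp(-2.0 * mu * t)


def p_differ_two_state(gain: float, loss: float, t: float) -> float:
    """Probability that two samples with coalescence time t show different
    states under a two-state chain (rates `gain` and `loss` per generation),
    assuming the ancestral state is drawn from the stationary distribution."""
    total = gain + loss
    pi_on = gain / total   # stationary frequency of the "on" state
    pi_off = loss / total  # stationary frequency of the "off" state
    # Two-state chains are reversible, so the two branches act as one path of length 2t.
    return 2.0 * pi_on * pi_off * (1.0 - math.exp(-2.0 * total * t))


if __name__ == "__main__":
    snp_rate = 1e-8   # hypothetical SNP-like rate (illustrative only)
    epi_rate = 1e-4   # hypothetical hyper-mutable marker rate (illustrative only)
    for t in (1e2, 1e4, 1e6):
        slow = p_differ_infinite_sites(snp_rate, t)
        fast = p_differ_two_state(epi_rate, epi_rate, t)
        print(f"t={t:>9.0f}  slow marker: {slow:.3e}  fast two-state marker: {fast:.3e}")
```

Under these assumptions the fast marker is far more informative than the slow one at short coalescence times, but its signal saturates (here at 0.5 for symmetric rates) at long times. This is one way to see why a finite-site treatment is needed when such markers are combined with SNPs.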
About the journal:
eLife is a not-for-profit, peer-reviewed, open access scientific journal covering the biomedical and life sciences. It is known for its selective publication process and publishes a variety of article types, including:
Research Articles: Detailed reports of original research findings.
Short Reports: Concise presentations of significant findings that do not warrant a full-length research article.
Tools and Resources: Descriptions of new tools, technologies, or resources that facilitate scientific research.
Research Advances: Brief reports on significant scientific advancements that have immediate implications for the field.
Scientific Correspondence: Short communications that comment on or provide additional information related to published articles.
Review Articles: Comprehensive overviews of a specific topic or field within the life sciences.