Hierarchical Heuristic Species Delimitation under the Multispecies Coalescent Model with Migration

IF 6.1 1区 生物学 Q1 EVOLUTIONARY BIOLOGY Systematic Biology Pub Date : 2024-08-20 DOI:10.1093/sysbio/syae050
Daniel Kornai, Xiyun Jiao, Jiayi Ji, Tomáš Flouri, Ziheng Yang
{"title":"Hierarchical Heuristic Species Delimitation under the Multispecies Coalescent Model with Migration","authors":"Daniel Kornai, Xiyun Jiao, Jiayi Ji, Tomáš Flouri, Ziheng Yang","doi":"10.1093/sysbio/syae050","DOIUrl":null,"url":null,"abstract":"The multispecies coalescent (MSC) model accommodates genealogical fluctuations across the genome and provides a natural framework for comparative analysis of genomic sequence data from closely related species to infer the history of species divergence and gene flow. Given a set of populations, hypotheses of species delimitation (and species phylogeny) may be formulated as instances of MSC models (e.g., MSC for one species versus MSC for two species) and compared using Bayesian model selection. This approach, implemented in the program bpp, has been found to be prone to over-splitting. Alternatively heuristic criteria based on population parameters (such as popula- tion split times, population sizes, and migration rates) estimated from genomic data may be used to delimit species. Here we develop hierarchical merge and split algorithms for heuristic species delimitation based on the genealogical divergence index (𝑔𝑑𝑖) and implement them in a python pipeline called hhsd. We characterize the behavior of the 𝑔𝑑𝑖 under a few simple scenarios of gene flow. We apply the new approaches to a dataset simulated under a model of isolation by distance as well as three empirical datasets. Our tests suggest that the new approaches produced sensible results and were less prone to over-splitting. We discuss possible strategies for accommodating paraphyletic species in the hierarchical algorithm, as well as the challenges of species delimitation based on heuristic criteria.","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":null,"pages":null},"PeriodicalIF":6.1000,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Systematic Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/sysbio/syae050","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EVOLUTIONARY BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

The multispecies coalescent (MSC) model accommodates genealogical fluctuations across the genome and provides a natural framework for comparative analysis of genomic sequence data from closely related species to infer the history of species divergence and gene flow. Given a set of populations, hypotheses of species delimitation (and species phylogeny) may be formulated as instances of MSC models (e.g., MSC for one species versus MSC for two species) and compared using Bayesian model selection. This approach, implemented in the program bpp, has been found to be prone to over-splitting. Alternatively heuristic criteria based on population parameters (such as popula- tion split times, population sizes, and migration rates) estimated from genomic data may be used to delimit species. Here we develop hierarchical merge and split algorithms for heuristic species delimitation based on the genealogical divergence index (𝑔𝑑𝑖) and implement them in a python pipeline called hhsd. We characterize the behavior of the 𝑔𝑑𝑖 under a few simple scenarios of gene flow. We apply the new approaches to a dataset simulated under a model of isolation by distance as well as three empirical datasets. Our tests suggest that the new approaches produced sensible results and were less prone to over-splitting. We discuss possible strategies for accommodating paraphyletic species in the hierarchical algorithm, as well as the challenges of species delimitation based on heuristic criteria.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
多物种聚合模型下的分层启发式物种划分与迁移
多物种聚合(MSC)模型可容纳整个基因组的谱系波动,并为近缘物种基因组序列数据的比较分析提供了一个自然框架,以推断物种分化和基因流动的历史。给定一组种群,物种划界(和物种系统发育)的假设可以表述为 MSC 模型的实例(例如,一个物种的 MSC 与两个物种的 MSC),并使用贝叶斯模型选择法进行比较。这种方法已在 bpp 程序中实现,但发现容易造成过度分裂。另一种方法是根据基因组数据估算出的种群参数(如种群分裂时间、种群大小和迁移率),采用启发式标准来划分物种。在此,我们基于系谱学分歧指数(𝑔𝑑𝑖)开发了启发式物种划界的分层合并与拆分算法,并在名为 hhsd 的 python 管道中加以实现。我们描述了几种简单的基因流动情况下 𝑔𝑖𝑑的行为特征。我们将新方法应用于在距离隔离模型下模拟的数据集以及三个经验数据集。我们的测试表明,新方法产生了合理的结果,而且不容易出现过度分裂。我们讨论了在分层算法中容纳旁系物种的可能策略,以及基于启发式标准的物种划分所面临的挑战。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Systematic Biology
Systematic Biology 生物-进化生物学
CiteScore
13.00
自引率
7.70%
发文量
70
审稿时长
6-12 weeks
期刊介绍: Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.
期刊最新文献
How to validate a Bayesian evolutionary model. Evolution of Large Eyes in Stromboidea (Gastropoda): Impact of Photic Environment and Life History Traits. Rapid Evolution of Host Repertoire and Geographic Range in a Young and Diverse Genus of Montane Butterflies. Towards Reliable Detection of Introgression in the Presence of Among-Species Rate Variation. Toward a Semi-Supervised Learning Approach to Phylogenetic Estimation.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1