Systematic Biology: latest articles, page 4

Genomic and phenotypic delimitation of species in a temperate aquatic biodiversity hotspot

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-24 DOI: 10.1093/sysbio/syaf083

Daniel J MacGuigan, Adam Taylor, Ava Ghezelayagh, Julia E Wood, Jeffrey W Simmons, Jon M Mollish, Thomas J Near

Biologists have relied on morphological characteristics to identify, define, and formally describe species for the past 250 years. The advent of phylogenetic species concepts and the introduction of molecular data have spawned new species delimitation methods applicable to a wide range of eukaryotic lineages. However, these approaches heavily emphasize genomic data, often overlooking phenotypic traits. We present and implement a species delimitation approach that utilizes genome-wide markers from ddRAD-seq and meristic morphological traits, which have long been used to identify and delineate fish species. Our methodology employs unsupervised machine learning to analyze morphological data without a priori species assignments, allowing phenotypic patterns to emerge independently from genomic-based species delimitation. We apply our combined genomic and phenotypic methodology to the freshwater systems of Southeastern North America, a biodiversity hotspot where conservation efforts are hampered by an incomplete knowledge of species diversity. Our investigation focuses on the darter clade Allohistium, a threatened lineage comprising two described species. Through phylogenomic, population genetic, and phenotypic model comparisons, we provide evidence supporting the delimitation of a third species of Allohistium, which we formally describe. Our approach shows how unsupervised machine learning can reveal cryptic morphological diversity that might otherwise be obscured by taxonomic preconceptions. This study demonstrates that model testing using diverse lines of evidence yields a more comprehensive, data-driven hypothesis of species diversity.

Citations: 0

Dating the Bacterial Tree of Life Based on Ancient Symbiosis.

IF 5.7 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-22 DOI: 10.1093/sysbio/syae071

Sishuo Wang, Haiwei Luo

Obtaining a timescale for bacterial evolution is crucial to understand early life evolution but is difficult owing to the scarcity of bacterial fossils. Here, we introduce multiple new time constraints to calibrate bacterial evolution based on ancient symbiosis. This idea is implemented using a bacterial tree constructed with genes found in the mitochondrial lineages phylogenetically embedded within Proteobacteria. The expanded mitochondria-bacterial tree allows the node age constraints of eukaryotes established by their abundant fossils to be propagated to ancient co-evolving bacterial symbionts and across the bacterial tree of life. Importantly, we formulate a new probabilistic framework that considers uncertainty in inference of the ancestral lifestyle of modern symbionts to apply 19 relative time constraints each informed by host-symbiont association to constrain bacterial symbionts no older than their eukaryotic host. Moreover, we develop an approach to incorporating substitution mixture models that better accommodate substitutional saturation and compositional heterogeneity for dating deep phylogenies. Our analysis estimates that the last bacterial common ancestor occurred approximately 4.0-3.5 billion years ago (Ga), followed by rapid divergence of major bacterial clades. It is generally robust to alternative root ages, root positions, tree topologies, fossil ages, ancestral lifestyle reconstruction, gene sets, among other factors. The obtained timetree serves as a foundation for testing hypotheses regarding bacterial diversification and its correlation with geobiological events across different timescales.
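
The idea of a relative time constraint that is applied under uncertainty about ancestral lifestyle can be illustrated with a minimal sketch. This is not the authors' probabilistic framework: the mixture weighting below, the function names, and the toy ages are all assumptions made for illustration. The sketch treats a constraint "symbiont no older than host" as binding with probability `p_host_associated` (the inferred chance that the ancestor was already host-associated) and as absent otherwise.

```python
def relative_constraint_weight(symbiont_age, host_age, p_host_associated):
    """Weight for one sampled pair of node ages (hypothetical scheme).

    With probability p_host_associated the ancestor was a symbiont, so the
    constraint symbiont_age <= host_age must hold; otherwise no constraint
    applies. The weight is the resulting mixture of the two cases.
    """
    if not 0.0 <= p_host_associated <= 1.0:
        raise ValueError("p_host_associated must lie in [0, 1]")
    if symbiont_age <= host_age:
        return 1.0  # constraint satisfied: nothing to penalise
    return 1.0 - p_host_associated  # violated: only the unconstrained case survives


def joint_weight(constraints):
    """Multiply the weights of several (symbiont_age, host_age, p) triples."""
    weight = 1.0
    for symbiont_age, host_age, p_host_associated in constraints:
        weight *= relative_constraint_weight(symbiont_age, host_age, p_host_associated)
    return weight


# Toy ages in Ga, invented for illustration only.
example = [(1.2, 1.5, 0.9), (1.8, 1.5, 0.9)]
print(joint_weight(example))
```

A sampled set of node ages that violates a well-supported constraint is down-weighted heavily, while a violation of a poorly supported one costs little; a dating program would combine such a weight with its other priors.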

Citations: 0

Link between the Birth-Death Process and the Kingman Coalescent-Applications to Phylogenetic Epidemiology.

IF 5.7 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-22 DOI: 10.1093/sysbio/syaf024

Josselin Cornuault, Fabio Pardi, Celine Scornavacca

The two most popular tree models used in phylogenetics are the birth-death process (BD) and the Kingman coalescent (KC). These two models differ in several respects, notably: (i) the curve of the population size through time is a stochastic process in the BD, versus a parametrized curve in the KC, (ii) the BD makes assumptions about the way samples are collected, while the KC conditions on the number of samples and the collection times, thus bypassing the need to describe the sampling procedure. These two models have been applied to different contexts: the BD in macroevolutionary studies of clades of species, and the KC for populations. The exception is the field of phylogenetic epidemiology which uses both models. This then asks the question of how such different models can be used in the same context. In this paper, we study large-population limits of the BD, in a search for a mathematical link between the BD and the KC. We show that the KC is the large-population limit of a BD conditioned on a given population trajectory, and we provide the formula for the parameter θ of the limiting KC. This formula appears in earlier studies, but the present article is the first to show formally how the correspondence arises as a large-population limit, and that the BD needs to be conditioned for the KC to arise. Besides these fundamentally mathematical results, we demonstrate how our findings can be used practically in phylogenetic inference. In particular, we propose a new method for phylogenetic epidemiology, called CalicoBird, ensuing from our results. We conjecture that this new method, used in conjunction with auxiliary data (e.g. prevalence or incidence data), should allow estimating important epidemiological parameters (e.g. the prevalence and the effective reproduction number), in a way that is robust to the data-generating model and the sampling procedure. Future studies will be needed to put our claims to the test.
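
For readers less familiar with the Kingman coalescent, a minimal simulation sketch may help. It assumes the simplest textbook setting, which is narrower than the one in the abstract: all samples are taken at the same time and θ is a constant time-scaling parameter, so that k lineages coalesce at rate C(k, 2)/θ. The paper's formula for θ and the CalicoBird method are not reproduced here, and the function names are hypothetical.

```python
import random
from math import comb


def kingman_coalescence_times(n_samples, theta, rng):
    """Cumulative coalescence times, going backward, for n_samples lineages.

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate C(k, 2) / theta (constant-theta assumption).
    """
    times = []
    t = 0.0
    for k in range(n_samples, 1, -1):
        t += rng.expovariate(comb(k, 2) / theta)
        times.append(t)
    return times


def expected_tmrca(n_samples, theta):
    """Closed-form mean time to the most recent common ancestor."""
    # Sum over k = 2..n of theta / C(k, 2) telescopes to 2 * theta * (1 - 1/n).
    return 2.0 * theta * (1.0 - 1.0 / n_samples)


if __name__ == "__main__":
    rng = random.Random(1)
    n, theta, reps = 10, 1.0, 20000
    mean_tmrca = sum(
        kingman_coalescence_times(n, theta, rng)[-1] for _ in range(reps)
    ) / reps
    print("simulated mean TMRCA:", mean_tmrca)
    print("expected mean TMRCA: ", expected_tmrca(n, theta))
```

Comparing the simulated mean with the closed form is a quick sanity check on the rate; a time-varying θ(t), as implied by a population trajectory, would need a time-rescaled version of the same loop.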

Citations: 0

PhyloCNN: Improving tree representation and neural network architecture for deep learning from trees in phylodynamics and diversification studies

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-22 DOI: 10.1093/sysbio/syaf082

Manolo Fernandez Perez, Olivier Gascuel

Phylodynamics and diversification studies using complex evolutionary models can be challenging, especially with traditional likelihood-based approaches. As an alternative, likelihood-free simulation-based approaches have been proposed due to their ability to incorporate complex models and scenarios. Here, we propose a new simulation-based deep learning (DL) method capable of selecting birth-death models and accurately estimating their parameters in both phylodynamics and diversification studies. We use a convolutional approach, where trees are encoded using the neighborhood of all nodes and leaves of the input phylogeny. We also developed a dedicated neural network architecture called PhyloCNN. Using simulations, we compared the accuracy of PhyloCNN when using a variable number of neighbors to describe the local context of nodes and leaves. The number of neighbors had a greater impact when considering smaller training sets, with a broader context showing higher accuracy, especially for complex evolutionary models. Compared to other recently developed DL approaches, PhyloCNN showed higher or similar accuracies for all parameters when used with training sets one or two orders of magnitude smaller (10,000 to 100,000 simulated training trees, instead of millions). PhyloCNN also compared favorably with state-of-the-art likelihood-based methods. We applied PhyloCNN with compelling results to two real-world phylodynamics and diversification datasets, related to HIV superspreaders in Zurich and to primates and their ecological role as seed dispersers. The high accuracy and computational efficiency of PhyloCNN opens new possibilities for phylodynamics and diversification studies that need to account for idiosyncratic phylogenetic histories with specific parameter spaces and sampling scenarios.

Citations: 0

Massive Inter-species Introgression Overwhelms Phylogenomic Relationships Among Jaguar, Lion, and Leopard.

IF 5.7 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-22 DOI: 10.1093/sysbio/syaf021

Sarah H D Santos, Henrique V Figueiró, Tomas Flouri, Emiliano Ramalho, Laury Cullen, Ziheng Yang, William J Murphy, Eduardo Eizirik

Phylogenomic analyses of closely related species allow important glimpses into their evolutionary history. Although recent studies have demonstrated that inter-species hybridization has occurred in several groups, incorporating this process in phylogenetic reconstruction remains challenging. Specifically, the most predominant topology across the genome is often assumed to reflect the speciation tree, but rampant hybridization might overwhelm the genomes, causing that assumption to be violated. The notoriously challenging phylogeny of the 5 extant Panthera species (specifically jaguar [P. onca], lion [P. leo], and leopard [P. pardus]) is an interesting system to address this problem. Here we employed a Panthera-wide whole-genome-sequence data set incorporating 3 jaguar genomes and 2 representatives of lions and leopards to dissect the relationships among these 3 species. Maximum-likelihood trees reconstructed from non-overlapping genomic fragments of 4 different sizes strongly supported the monophyly of all 3 species. The most frequent topology (76-95%) united lion + leopard as a sister species (topology 1), followed by lion + jaguar (topology 2: 4-8%) and leopard + jaguar (topology 3: 0-6%). Topology 1 was dominant across the genome, especially in high-recombination regions. Topologies 2 and 3 were enriched in low-recombination segments, likely reflecting the species tree in the face of hybridization. Divergence times between sister species of each topology, corrected for local-recombination-rate effects, indicated that the lion-leopard divergence was significantly younger than the alternatives, likely driven by post-speciation admixture. Introgression analyses detected pervasive hybridization between lions and leopards, regardless of the assumed species tree. This inference was strongly supported by multispecies-coalescence-with-introgression analyses, which rejected topology 1 (lion+leopard) or any model without introgression. Interestingly, topologies 2 (lion+jaguar) and 3 (jaguar+leopard) with extensive lion-leopard introgression were unidentifiable, highlighting the complexity of this phylogenetic problem. Our results suggest that the dominant genome-wide tree topology is not the true species tree but rather a consequence of overwhelming post-speciation admixture between lion and leopard.
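
The topology percentages quoted in the abstract are shares of genomic windows whose local tree supports a given sister pair. A minimal sketch of that tally is shown below; the function name and the toy input are hypothetical, and the counts are invented for illustration rather than taken from the study.

```python
from collections import Counter


def topology_frequencies(window_sister_pairs):
    """Share of windows supporting each sister pair.

    Each element names the two species recovered as sisters in one
    window's local tree; pairs are sorted so order does not matter.
    """
    counts = Counter(tuple(sorted(pair)) for pair in window_sister_pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Toy input: invented counts, NOT data from the study.
toy_windows = (
    [("lion", "leopard")] * 8
    + [("jaguar", "lion")] * 1
    + [("leopard", "jaguar")] * 1
)

freqs = topology_frequencies(toy_windows)
for pair, share in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(pair, f"{share:.0%}")
```

As the abstract stresses, the most frequent pair in such a tally need not be the species tree; splitting the windows by local recombination rate before counting is one way to see whether the ranking shifts.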

Citations: 0

Waves of Colonization and Gene Flow in a Great Speciator.

IF 5.7 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-22 DOI: 10.1093/sysbio/syaf023

Ethan F Gyllenhaal, Serina S Brady, Lucas H DeCicco, Alivereti Naikatini, Paul M Hime, Joseph D Manthey, John Kelly, Robert G Moyle, Michael J Andersen

Secondary contact between previously allopatric lineages offers a test of reproductive isolating mechanisms that may have accrued in isolation. Such instances of contact can produce stable hybrid zones, where reproductive isolation can further develop via reinforcement or phenotypic displacement, or result in the lineages merging. Ongoing secondary contact is most visible in continental systems, where steady input from parental taxa can occur readily. In oceanic island systems, however, secondary contact between closely related species of birds is relatively rare. When observed on sufficiently small islands, relative to population size, secondary contact likely represents a recent phenomenon. Here, we examine the dynamics of a group of birds whose apparent widespread hybridization influenced Ernst Mayr's foundational work on allopatric speciation: the whistlers of Fiji (Aves: Pachycephala). We demonstrate 2 clear instances of secondary contact within the Fijian archipelago, one resulting in a hybrid zone on a larger island, and the other resulting in a wholly admixed population on a smaller island. We leveraged low genome-wide divergence in the hybrid zone to pinpoint a single genomic region associated with observed phenotypic differences. We use genomic data to present a new hypothesis that emphasizes rapid plumage evolution and post-divergence gene flow.

Citations: 0

Correction to: A Phylogenetic Approach to Delimitate Species in a Probabilistic Way.

IF 5.7 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-18 DOI: 10.1093/sysbio/syaf035

Citations: 0

Gene flow complicates phylogenetic inference in an archipelago radiation.

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-12 DOI: 10.1093/sysbio/syaf081

Ethan F Gyllenhaal, Lukas B Klicka, Lucas H DeCicco, Brian C Weeks, Robert G Moyle, Michael J Andersen

Allopatric divergence is a fundamental component of most traditional models of biogeography and community assembly. Gene flow between allopatric populations should be influenced by the nature of geographic barriers and can have a profound impact on adaptation, the speciation process, and phylogenetic inference. Superspecies (monophyletic groups of taxa with species-level differences in phenotype or genotype that are found exclusively in allopatry or parapatry) present an opportunity to characterize the effects of gene flow on the divergence process. Here we investigate patterns of gene flow, population structure, and inferred phylogenetic relationships for members of an avian superspecies, the Solomons Monarchs (Aves: Symposiachrus barbatus complex) occupying the Solomon Islands. We found that gene flow among allopatric species matches predictions based on geography, but phylogenetic relationships were not concordant with the most likely colonization history based on a stepping-stone colonization model. Notably, the most isolated island, Makira, has a species that was inferred to be sister to the taxa on all other islands in concatenated phylogenetic analyses, despite Makira being farthest from the presumed original source of immigrants. We use population genetic simulations to demonstrate that such a result could be driven by bias resulting from low levels of gene flow, reflecting a challenge in phylogeographic inference that results when one population is differentially isolated. These simulated findings demonstrate a distinguishability issue in phylogeographic inference, where gene flow and colonization history can be difficult to disentangle.

Citations: 0

Incorporating continuous characters in joint estimation of dicynodont phylogeny

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-11 DOI: 10.1093/sysbio/syaf078

Brenen M Wynd, Basanta Khakurel, Christian F Kammerer, Peter J Wagner, April M Wright

Continuous characters have received comparatively little attention in Bayesian phylogenetic estimation. This is predominantly because they cannot be modeled by a standard phylogenetic Q-matrix approach due to their non-discrete nature. In this paper, we explore the use of continuous traits under two Brownian motion models to estimate a phylogenetic tree for Dicynodontia, a well-studied group of early synapsids (stem mammals) in which both discrete and continuous characters have been extensively used in parsimony-based tree reconstruction. We examine the differences in phylogenetic signal between a continuous trait partition, a discrete trait partition, and a joint analysis with both types of characters. We find that continuous and discrete traits contribute substantially different signal to the analysis, even when other parts of the model (clock and tree) are held constant. Tree topologies resulting from the new analyses differ strongly from the established phylogeny for dicynodonts, highlighting continued difficulty in incorporating truly continuous data in a Bayesian phylogenetic framework.

Citations: 0

Efficient Inference of Macrophylogenies: Insights from the Avian Tree of Life

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-11-08 DOI: 10.1093/sysbio/syaf080

Min Zhao, Gregory Thom, Brant C Faircloth, Michael J Andersen, F Keith Barker, Brett W Benz, Michael J Braun, Gustavo A Bravo, Robb T Brumfield, R Terry Chesser, Elizabeth P Derryberry, Travis C Glenn, Michael G Harvey, Peter A Hosner, Tyler S Imfeld, Leo Joseph, Joseph D Manthey, John E McCormack, Jenna M McCullough, Robert G Moyle, Carl H Oliveros, Noor D White Carreiro, Kevin Winker, Daniel J Field, Daniel T Ksepka, Edward L Braun, Rebecca T Kimball, Brian Tilston Smith

The exponential growth of molecular sequence data over the past decade has enabled the construction of numerous clade-specific phylogenies encompassing hundreds or thousands of taxa. These independent studies often include overlapping data, presenting a unique opportunity to build macrophylogenies (phylogenies sampling > 1,000 taxa) for entire classes across the Tree of Life. However, the inference of large trees remains constrained by logistical, computational, and methodological challenges. The Avian Tree of Life provides an ideal model for evaluating strategies to robustly infer macrophylogenies from intersecting datasets derived from smaller studies. In this study, we leveraged a comprehensive resource of sequence capture datasets to evaluate the phylogenetic accuracy and computational costs of four methodological approaches: (1) supermatrix approaches using concatenation, including the “fast” maximum likelihood (ML) methods, (2) filtering datasets to reduce heterogeneity, (3) supertree estimation based on published phylogenomic trees, and (4) a “divide-and-conquer” strategy, wherein smaller ML trees were estimated and subsequently combined using a supertree approach. Additionally, we examined the impact of these methods on divergence time estimation using a dataset that includes newly vetted fossil calibrations for the Avian Tree of Life. Our findings highlight the advantages of recently developed fast tree search approaches initiated with parsimony starting trees, which offer a reasonable compromise between computational efficiency and phylogenetic accuracy, facilitating inference of macrophylogenies.

Citations: 0