Systematic Biology: latest articles

Integrating genomics and biogeography to unravel the origin of a mountain biota: The case of a reptile endemicity hotspot in Arabia.
IF 6.1 | Zone 1 (Biology) | Q1 Evolutionary Biology | Pub Date: 2024-07-02 | DOI: 10.1093/sysbio/syae032
Bernat Burriel-Carranza, Héctor Tejero-Cicuéndez, Albert Carné, Gabriel Mochales-Riaño, Adrián Talavera, Saleh Al Saadi, Johannes Els, Jiří Šmíd, Karin Tamar, Pedro Tarroso, Salvador Carranza

Advances in genomics have greatly enhanced our understanding of mountain biodiversity, providing new insights into the complex and dynamic mechanisms that drive the formation of mountain biotas. These span from broad biogeographic patterns to population dynamics and adaptations to these environments. However, significant challenges remain in integrating large-scale and fine-scale findings to develop a comprehensive understanding of mountain biodiversity. One significant challenge is the lack of genomic data, particularly in historically understudied arid regions where reptiles are a particularly diverse vertebrate group. In the present study, we assembled a de novo genome-wide SNP dataset for the complete endemic reptile fauna of a mountain range (19 described species with more than 600 specimens sequenced), and integrated state-of-the-art biogeographic analyses at the population, species, and community level. Thus, we provide a holistic account of how a whole endemic reptile community has originated, diversified and dispersed through a mountain system. Our results show that reptiles independently colonized the Hajar Mountains of southeastern Arabia 11 times. After colonization, species delimitation methods suggest high levels of within-mountain diversification, supporting up to 49 deep lineages. This diversity is strongly structured following local topography, with the highest peaks acting as a broad barrier to gene flow across the entire community. Interestingly, orogenic events do not seem to be key drivers of the biogeographic history of reptiles in this system. Instead, past climatic events seem to have had a major role in the assembly of this community. We observe an increase in vicariant events from the Late Pliocene onwards, coinciding with an unstable climatic period of rapid shifts between hyper-arid and semiarid conditions that led to the ongoing desertification of Arabia. We conclude that paleoclimate, and particularly extreme aridification, acted as a main driver of diversification in arid mountain systems, which is entangled with the generation of highly adapted endemicity. Overall, our study not only provides a valuable contribution to understanding the evolution of mountain biodiversity, but also offers a flexible and scalable approach that can be applied to any taxonomic group in any discrete environment.

Citations: 0
The influence of the number of tree searches on maximum likelihood inference in phylogenomics
IF 6.5 | Zone 1 (Biology) | Q1 Evolutionary Biology | Pub Date: 2024-06-28 | DOI: 10.1093/sysbio/syae031
Chao Liu, Xiaofan Zhou, Yuanning Li, Chris Todd Hittinger, Ronghui Pan, Jinyan Huang, Xue-xin Chen, Antonis Rokas, Yun Chen, Xing-Xing Shen
Maximum likelihood (ML) phylogenetic inference is widely used in phylogenomics. As heuristic searches most likely find suboptimal trees, it is recommended to conduct multiple (e.g., ten) tree searches in phylogenetic analyses. However, beyond its positive role, how and to what extent multiple tree searches aid ML phylogenetic inference remains poorly explored. Here, we found that a random starting tree was not as effective as the BioNJ and parsimony starting trees in inferring ML gene trees and that RAxML-NG and PhyML were less sensitive to different starting trees than IQ-TREE. We then examined the effect of the number of tree searches on ML tree inference with IQ-TREE and RAxML-NG, by running 100 tree searches on 19,414 gene alignments from 15 animal, plant, and fungal phylogenomic datasets. We found that the number of tree searches substantially impacted the recovery of the best-of-100 ML gene tree topology for a given ML program. In addition, all of the concatenation-based trees were topologically identical if the number of tree searches was ≥ 10. Quartet-based ASTRAL trees inferred from 1 to 80 tree searches differed topologically from those inferred from 100 tree searches for 6/15 phylogenomic datasets. Lastly, our simulations showed that gene alignments with lower difficulty scores had a higher chance of finding the best-of-100 gene tree topology and were more likely to yield the correct trees.
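Why repeating the search helps can be illustrated with a back-of-the-envelope sketch. It assumes, purely for illustration, that each heuristic search is an independent trial that recovers the best topology with a fixed probability `p_single`; that independence assumption and the function names are hypothetical and are not taken from the study, whose results are empirical.

```python
import math


def prob_best_found(p_single: float, n_searches: int) -> float:
    """Chance that at least one of n independent searches finds the best tree."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    return 1.0 - (1.0 - p_single) ** n_searches


def searches_needed(p_single: float, target: float = 0.95) -> int:
    """Smallest number of searches whose success chance reaches `target`."""
    if not 0.0 < p_single < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p_single and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


if __name__ == "__main__":
    # Illustrative values only; real per-search success rates vary by alignment.
    for k in (1, 10, 100):
        print(k, round(prob_best_found(0.2, k), 3))
```

Under this toy model a hard alignment (small `p_single`) needs many more searches than an easy one, which is in line with the direction of the difficulty-score result reported above, although real searches need not be independent.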
Citations: 0
Stochastic Character Mapping, Bayesian Model Selection, and Biosynthetic Pathways Shed New Light on the Evolution of Habitat Preference in Cyanobacteria.
IF 6.1 | Zone 1 (Biology) | Q1 Evolutionary Biology | Pub Date: 2024-06-27 | DOI: 10.1093/sysbio/syae025
Giorgio Bianchini, Martin Hagemann, Patricia Sánchez-Baracaldo

Cyanobacteria are the only prokaryotes to have evolved oxygenic photosynthesis, paving the way for complex life. Studying the evolution and ecological niche of cyanobacteria and their ancestors is crucial for understanding the intricate dynamics of biosphere evolution. These organisms frequently deal with environmental stressors such as salinity and drought, and they employ compatible solutes as a mechanism to cope with these challenges. Compatible solutes are small molecules that help maintain cellular osmotic balance in high-salinity environments, such as marine waters. Their production plays a crucial role in salt tolerance, which, in turn, influences habitat preference. The synthesis of the five known compatible solutes produced by cyanobacteria (sucrose, trehalose, glucosylglycerol, glucosylglycerate, and glycine betaine) varies between individual strains. In this study, we work in a Bayesian stochastic mapping framework, integrating multiple sources of information about compatible solute biosynthesis in order to predict the ancestral habitat preference of Cyanobacteria. Through extensive model selection analyses and statistical tests for correlation, we identify glucosylglycerol and glucosylglycerate as the most significantly correlated with habitat preference, while trehalose exhibits the weakest correlation. Additionally, glucosylglycerol, glucosylglycerate, and glycine betaine show high loss/gain rate ratios, indicating their potential role in adaptability, while sucrose and trehalose are less likely to be lost due to their additional cellular functions. Contrary to previous findings, our analyses predict that the last common ancestor of Cyanobacteria (living at around 3180 Ma) had a 97% probability of a high-salinity habitat preference and was likely able to synthesise glucosylglycerol and glucosylglycerate. Nevertheless, cyanobacteria likely colonized low-salinity environments shortly after their origin, with an 89% probability of the first cyanobacterium with low-salinity habitat preference arising prior to the Great Oxygenation Event (2460 Ma). Stochastic mapping analyses provide evidence of cyanobacteria inhabiting early marine habitats, aiding in the interpretation of the geological record. Our age estimate of ~2590 Ma for the divergence of two major cyanobacterial clades (Macro- and Microcyanobacteria) suggests that these were likely significant contributors to primary productivity in marine habitats in the lead-up to the Great Oxygenation Event, and thus played a pivotal role in triggering the sudden increase in atmospheric oxygen.

Citations: 0
Phylo2Vec: a vector representation for binary trees
IF 6.5 | Zone 1 (Biology) | Q1 Evolutionary Biology | Pub Date: 2024-06-26 | DOI: 10.1093/sysbio/syae030
Matthew J Penn, Neil Scheidwasser, Mark P Khurana, David A Duchêne, Christl A Donnelly, Samir Bhatt
Binary phylogenetic trees inferred from biological data are central to understanding the shared history among evolutionary units. However, inferring the placement of latent nodes in a tree is computationally expensive. State-of-the-art methods rely on carefully designed heuristics for tree search, using different data structures for easy manipulation (e.g., classes in object-oriented programming languages) and readable representation of trees (e.g., Newick-format strings). Here, we present Phylo2Vec, a parsimonious encoding for phylogenetic trees that serves as a unified approach for both manipulating and representing phylogenetic trees. Phylo2Vec maps any binary tree with n leaves to a unique integer vector of length n − 1. The advantages of Phylo2Vec are fourfold: (i) fast tree sampling, (ii) compressed tree representation compared to a Newick string, (iii) quick and unambiguous verification of whether two binary trees are topologically identical, and (iv) a systematic ability to traverse tree space in very large or small jumps. As a proof of concept, we use Phylo2Vec for maximum likelihood inference on five real-world datasets and show that a simple hill-climbing-based optimisation scheme can efficiently traverse the vastness of tree space from a random to an optimal tree.
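The idea that an integer vector of length n − 1 can name every rooted binary tree on n labelled leaves can be illustrated with a small sketch. It assumes a generic stepwise-addition encoding in which entry i (zero-based) lies in 0..2i and picks the branch onto which leaf i + 1 is grafted; the node numbering and the names `decode` and `num_rooted_binary_trees` are illustrative choices and should not be read as the Phylo2Vec package's own conventions or API.

```python
from itertools import product
from math import prod


def num_rooted_binary_trees(n_leaves: int) -> int:
    """(2n - 3)!!, the number of rooted binary trees on n labelled leaves."""
    return prod(range(1, 2 * n_leaves - 2, 2))


def decode(v):
    """Build a canonical Newick string from a vector by stepwise leaf addition.

    Hypothetical convention: node 0 is leaf "0"; step i adds leaf node 2i + 1
    and internal node 2i + 2, grafting the new leaf onto the branch above
    node v[i].
    """
    for i, x in enumerate(v):
        if not 0 <= x <= 2 * i:
            raise ValueError(f"v[{i}] = {x} is outside 0..{2 * i}")
    parent = {0: None}
    children = {0: ()}          # leaves have an empty child tuple
    leaf_name = {0: "0"}
    root = 0
    for i, target in enumerate(v):
        leaf, internal = 2 * i + 1, 2 * i + 2
        leaf_name[leaf] = str(i + 1)
        children[leaf] = ()
        old_parent = parent[target]
        children[internal] = (target, leaf)
        parent[internal] = old_parent
        parent[target] = parent[leaf] = internal
        if old_parent is None:
            root = internal     # grafted above the old root
        else:
            children[old_parent] = tuple(
                internal if c == target else c for c in children[old_parent]
            )

    def render(node):
        if not children[node]:
            return leaf_name[node]
        # Sorting child strings makes the output independent of child order.
        return "(" + ",".join(sorted(render(c) for c in children[node])) + ")"

    return render(root) + ";"


def all_trees(n_leaves: int):
    """Decode every valid vector for n leaves into a set of Newick strings."""
    ranges = (range(2 * i + 1) for i in range(n_leaves - 1))
    return {decode(v) for v in product(*ranges)}


if __name__ == "__main__":
    print(decode([0, 0]))
    print(len(all_trees(4)), num_rooted_binary_trees(4))
```

Because each vector decodes to a distinct topology, comparing two trees reduces to comparing two short integer lists, and changing a single entry moves to a different tree, which is the kind of manipulation the abstract alludes to.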
Citations: 0
Toward a semi-supervised learning approach to phylogenetic estimation.
IF 6.1 | Zone 1 (Biology) | Q1 Agricultural and Biological Sciences | Pub Date: 2024-06-25 | DOI: 10.1093/sysbio/syae029
Daniele Silvestro, Thibault Latrille, Nicolas Salamin

Models have always been central to inferring molecular evolution and to reconstructing phylogenetic trees. Their use typically involves the development of a mechanistic framework reflecting our understanding of the underlying biological processes, such as nucleotide substitutions, and the estimation of model parameters by maximum likelihood or Bayesian inference. However, deriving and optimizing the likelihood of the data is not always possible under complex evolutionary scenarios or even tractable for large datasets, often leading to unrealistic simplifying assumptions in the fitted models. To overcome this issue, we coupled stochastic simulations of genome evolution with a new supervised deep learning model to infer key parameters of molecular evolution. Our model is designed to directly analyze multiple sequence alignments and estimate per-site evolutionary rates and divergence, without requiring a known phylogenetic tree. The accuracy of our predictions matched that of likelihood-based phylogenetic inference, when rate heterogeneity followed a simple gamma distribution, but it strongly exceeded it under more complex patterns of rate variation, such as codon models. Our approach is highly scalable and can be efficiently applied to genomic data, as we showed on a dataset of 26 million nucleotides from the clownfish clade. Our simulations also showed that the integration of per-site rates obtained by deep learning within a Bayesian framework led to significantly more accurate phylogenetic inference, particularly with respect to the estimated branch lengths. We thus propose that future advancements in phylogenetic analysis will benefit from a semi-supervised learning approach that combines deep-learning estimation of substitution rates, which allows for more flexible models of rate variation, and probabilistic inference of the phylogenetic tree, which guarantees interpretability and a rigorous assessment of statistical support.
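For readers unfamiliar with the "simple gamma distribution" baseline, the following minimal sketch draws per-site relative rates from a gamma distribution with mean 1, the usual parameterisation of among-site rate heterogeneity. The function name, the seed handling, and the value of `alpha` are illustrative assumptions; this is not the simulator used by the authors.

```python
import random
from statistics import fmean, pvariance


def sample_site_rates(n_sites: int, alpha: float, seed: int = 0):
    """Draw per-site relative rates from Gamma(shape=alpha, scale=1/alpha).

    With scale = 1/alpha the mean rate is 1 and the variance is 1/alpha,
    so a small alpha means strong rate heterogeneity among sites.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


if __name__ == "__main__":
    rates = sample_site_rates(20_000, alpha=0.5)
    # The sample mean should be close to 1 and the variance close to 1/alpha.
    print(round(fmean(rates), 2), round(pvariance(rates), 2))
```

A single shape parameter controls the whole distribution here, which is what makes the gamma model simple and also what limits it compared with the more complex rate patterns mentioned in the abstract.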

Citations: 0
Towards reliable detection of introgression in the presence of among-species rate variation.
IF 6.1 | Zone 1 (Biology) | Q1 Agricultural and Biological Sciences | Pub Date: 2024-06-24 | DOI: 10.1093/sysbio/syae028
Thore Koppetsch, Milan Malinsky, Michael Matschiner

The role of interspecific hybridization has recently seen increasing attention, especially in the context of diversification dynamics. Genomic research has now made it abundantly clear that both hybridization and introgression (the exchange of genetic material through hybridization and backcrossing) are far more common than previously thought. Besides cases of ongoing or recent genetic exchange between taxa, an increasing number of studies report "ancient introgression", referring to results of hybridization that took place in the distant past. However, it is not clear whether commonly used methods for the detection of introgression are applicable to such old systems, given that most of these methods were originally developed for analyses at the level of populations and recently diverged species, affected by recent or ongoing genetic exchange. In particular, the assumption of constant evolutionary rates, which is implicit in many commonly used approaches, is more likely to be violated as evolutionary divergence increases. To test the limitations of introgression detection methods when applied to old systems, we simulated thousands of genomic datasets under a wide range of settings, with varying degrees of among-species rate variation and introgression. Using these simulated datasets, we showed that some commonly applied statistical methods, including the D-statistic and certain tests based on sets of local phylogenetic trees, can produce false-positive signals of introgression between divergent taxa that have different rates of evolution. These misleading signals are caused by the presence of homoplasies occurring at different rates in different lineages. To distinguish between the patterns caused by rate variation and genuine introgression, we developed a new test that is based on the expected clustering of introgressed sites along the genome, and implemented this test in the program Dsuite.
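The D-statistic mentioned above contrasts two site patterns in a four-taxon arrangement ((P1, P2), P3, outgroup): D = (nABBA − nBABA) / (nABBA + nBABA). The sketch below is a minimal single-sequence version under stated assumptions: one haploid sequence per taxon, already aligned, with the outgroup allele treated as ancestral. The function name `d_statistic` is hypothetical, and this is not how Dsuite computes the statistic, since Dsuite works from population allele frequencies.

```python
def d_statistic(p1: str, p2: str, p3: str, outgroup: str):
    """Count ABBA and BABA sites and return (n_abba, n_baba, D).

    D is None when no informative site is present.
    """
    if not len(p1) == len(p2) == len(p3) == len(outgroup):
        raise ValueError("sequences must be aligned to the same length")
    n_abba = n_baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        site = {a, b, c, o}
        # Keep only biallelic sites made of unambiguous nucleotides.
        if len(site) != 2 or not site <= set("ACGT"):
            continue
        if a == o and b == c:        # P2 and P3 share the derived allele
            n_abba += 1
        elif b == o and a == c:      # P1 and P3 share the derived allele
            n_baba += 1
    total = n_abba + n_baba
    d = (n_abba - n_baba) / total if total else None
    return n_abba, n_baba, d


if __name__ == "__main__":
    # Toy alignment: two ABBA sites, one BABA site, two uninformative sites.
    print(d_statistic("AACAG", "GTAAG", "GTCAA", "AAAAA"))
```

In the absence of introgression the two patterns are expected to be equally frequent, so D is near zero; the abstract's point is that unequal substitution rates among lineages can also unbalance the counts through homoplasy, so a non-zero D alone is not conclusive for divergent taxa.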

Citations: 0
Exon capture museomics deciphers the nine-banded armadillo species complex and identifies a new species endemic to the Guiana Shield.
IF 6.1 Zone 1 Biology Q1 Agricultural and Biological Sciences Pub Date: 2024-06-22 DOI: 10.1093/sysbio/syae027
Mathilde Barthe, Loïs Rancilhac, Maria C Arteaga, Anderson Feijó, Marie-Ka Tilak, Fabienne Justy, W J Loughry, Colleen M McDonough, Benoit de Thoisy, François Catzeflis, Guillaume Billet, Lionel Hautier, Benoit Nabholz, Frédéric Delsuc

The nine-banded armadillo (Dasypus novemcinctus) is the most widespread xenarthran species across the Americas. Recent studies have suggested it is composed of four morphologically and genetically distinct lineages of uncertain taxonomic status. To address this issue, we used a museomic approach to sequence 80 complete mitogenomes and capture 997 nuclear loci for 71 Dasypus individuals sampled across the entire distribution. We carefully cleaned up potential genotyping errors and cross-contamination that could blur species boundaries by mimicking gene flow. Our results unambiguously support four distinct lineages within the D. novemcinctus complex. We found cases of mito-nuclear phylogenetic discordance but only limited contemporary gene flow confined to the margins of the lineage distributions. All available evidence, including the restricted gene flow, phylogenetic reconstructions based on both mitogenomes and nuclear loci, and phylogenetic delimitation methods, consistently supported the four lineages within D. novemcinctus as four distinct species. Genetic differentiation values comparable to those of other recognized Dasypus species further reinforced their status as valid species. Considering congruent morphological results from previous studies, we provide an integrative taxonomic view to recognise four species within the D. novemcinctus complex: D. novemcinctus, D. fenestratus, D. mexicanus, and D. guianensis sp. nov., a new species endemic to the Guiana Shield that we describe here. The two available individuals of D. mazzai and D. sabanicola were consistently nested within the D. novemcinctus lineage and their status remains to be assessed. The present work offers a case study illustrating the power of museomics to reveal cryptic species diversity within a widely distributed and emblematic species of mammals.

Citations: 0
Caught in the Act: Incipient Speciation at the Southern Limit of Viburnum in the Central Andes.
IF 6.5 Zone 1 Biology Q1 Agricultural and Biological Sciences Pub Date: 2024-06-04 DOI: 10.1093/sysbio/syae023
Carlos A Maya-Lastra, Patrick W Sweeney, Deren A R Eaton, Vania Torrez, Carla Maldonado, Malu I Ore-Rengifo, Mónica Arakaki, Michael J Donoghue, Erika J Edwards

A fundamental objective of evolutionary biology is to understand the origin of independently evolving species. Phylogenetic studies of species radiations rarely are able to document ongoing speciation; instead, modes of speciation, entailing geographic separation and/or ecological differentiation, are posited retrospectively. The Oreinotinus clade of Viburnum has radiated recently from north to south through the cloud forests of Mexico and Central America to the Central Andes. Our analyses support a hypothesis of incipient speciation in Oreinotinus at the southern edge of its geographic range, from central Peru to northern Argentina. Although several species and infraspecific taxa have been recognized in this area, multiple lines of evidence and analytical approaches (including analyses of phylogenetic relationships, genetic structure, leaf morphology, and climatic envelopes) favor the recognition of just a single species, V. seemenii. We show that what has previously been recognized as V. seemenii f. minor has recently occupied the drier Tucuman-Bolivian forest region from Samaipata in Bolivia to Salta in northern Argentina. Plants in these populations form a well-supported clade with a distinctive genetic signature and they have evolved smaller, narrower leaves. We interpret this as the beginning of a within-species divergence process that has elsewhere in the neotropics resulted repeatedly in Viburnum species with a particular set of leaf ecomorphs. Specifically, the southern populations are in the process of evolving the small, glabrous, and entire leaf ecomorph that has evolved in four other montane areas of endemism. As predicted based on our studies of leaf ecomorphs in Chiapas, Mexico, these southern populations experience generally drier conditions, with large diurnal temperature fluctuations. In a central portion of the range of V. seemenii, characterized by wetter climatic conditions, we also document what may be the initial differentiation of the leaf ecomorph with larger, pubescent, and toothy leaves. The emergence of these ecomorphs thus appears to be driven by adaptation to subtly different climatic conditions in separate geographic regions, as opposed to parapatric differentiation along elevational gradients as suggested by Viburnum species distributions in other parts of the neotropics.

Citations: 0
Two Notorious Nodes: A Critical Examination of Relaxed Molecular Clock Age Estimates of the Bilaterian Animals and Placental Mammals.
IF 6.5 Zone 1 Biology Q1 Agricultural and Biological Sciences Pub Date: 2024-05-27 DOI: 10.1093/sysbio/syad057
Graham E Budd, Richard P Mann

The popularity of relaxed clock Bayesian inference of clade origin timings has generated several recent publications with focal results considerably older than the fossils of the clades in question. Here, we critically examine two such clades: the animals (with a focus on the bilaterians) and the mammals (with a focus on the placentals). Each example displays a set of characteristic pathologies which, although much commented on, are rarely corrected for. We conclude that in neither case does the molecular clock analysis provide any evidence for an origin of the clade deeper than what is suggested by the fossil record. In addition, both these clades have other features (including, in the case of the placental mammals, proximity to a large mass extinction) that allow us to generate precise expectations of the timings of their origins. Thus, in these instances, the fossil record can provide a powerful test of molecular clock methodology, and why it goes astray, and we have every reason to think these problems are general. [Cambrian explosion; mammalian evolution; molecular clocks.]

Citations: 0
Detection of Ghost Introgression Requires Exploiting Topological and Branch Length Information.
IF 6.5 Zone 1 Biology Q1 Agricultural and Biological Sciences Pub Date: 2024-05-27 DOI: 10.1093/sysbio/syad077
Xiao-Xu Pang, Da-Yong Zhang

In recent years, the study of hybridization and introgression has made significant progress, with ghost introgression, the transfer of genetic material from extinct or unsampled lineages to extant species, emerging as a key area for research. Accurately identifying ghost introgression, however, presents a challenge. To address this issue, we focused on simple cases involving 3 species with a known phylogenetic tree. Using mathematical analyses and simulations, we evaluated the performance of popular phylogenetic methods, including HyDe and PhyloNet/MPL, and the full-likelihood method, Bayesian Phylogenetics and Phylogeography (BPP), in detecting ghost introgression. Our findings suggest that heuristic approaches relying on site-pattern counts or gene-tree topologies struggle to differentiate ghost introgression from introgression between sampled non-sister species, frequently leading to incorrect identification of donor and recipient species. By contrast, the full-likelihood method BPP, which uses multilocus sequence alignments directly and hence takes into account both gene-tree topologies and branch lengths, is capable of detecting ghost introgression in phylogenomic datasets. We analyzed a real-world phylogenomic dataset of 14 species of Jaltomata (Solanaceae) to showcase the potential of full-likelihood methods for accurate inference of introgression.

Citations: 0