Systematic Biology latest articles, page 9

Type genomics: a Framework for integrating Genomic Data into Biodiversity and Taxonomic research.

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-20 DOI: 10.1093/sysbio/syaf040

Harald Letsch,Carola Greve,Anna K Hundsdoerfer,Iker Irisarri,Jenna M Moore,Marianne Espeland,Stefan Wanke,Umilaela Arifin,Mozes P K Blom,Carolina Corrales,Alexander Donath,Uwe Fritz,Gunther Köhler,Patrick Kück,Sarah Lemer,Ximo Mengual,Nancy Mercado Salas,Karen Meusemann,Anja Palandačić,Christian Printzen,Julia D Sigwart,Karina L Silva-Brandão,Marianna Simões,Madlen Stange,Alexander Suh,Nikolaus Szucsich,Ekin Tilic,Till Töpfer,Astrid Böhne,Axel Janke,Steffen Pauls

Name-bearing type specimens have a fundamental role in characterising biodiversity, as these objects represent the physical link between a scientific name and the biological organism. Type specimens are usually deposited in natural history collections, which provide key infrastructure for research on essential biological structures and processes, while preserving records of biodiversity for future generations. Modern systematics increasingly depends on genetic and genomic data to differentiate and characterise species. While the results of genome sequencing are often connected to a physical voucher specimen, they are rarely derived from the ultimate taxonomic reference for a species, i.e., the name-bearing type specimens. This is a known but under-appreciated problem for ensuring the replicability of findings, especially those that affect the interpretation of biodiversity distributions and phylogenetic relationships. Destructive sampling of museum specimens, particularly of type material, often carries a high risk of sequencing failure, and thus the cost of damage to the specimen may outweigh the resulting benefit. Both taxonomic work and genome sequencing require specialist skills and there are often communication gaps between the respective experts. A new, harmonised approach, maximising information extraction while minimising risk to type specimens, is a critical step forward toward linking disciplines across biodiversity research and promoting a better taxonomic and systematic understanding of eukaryotic diversity. The genetic make-up of a type specimen is a fundamental part of its biological information, which can and should be made freely and digitally available through type genomics. Here we describe guidelines for the use of nomenclatural types in genome sequencing approaches considering different kinds of types in different stages of preservation and different data types.



Citations: 0

On the Mkv Model with Among-Character Rate Variation

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-16 DOI: 10.1093/sysbio/syaf038

Alessio Capobianco, Sebastian Höhna

Models used in likelihood-based morphological phylogenetics often adapt molecular phylogenetics models to the specificities of morphological data. Such is the case for the widely used Mkv model—which introduces an acquisition bias correction for sampling only characters that are observed to be variable—and for models of among-character rate variation (ACRV), routinely applied by researchers to relax the equal-rates assumption of Mkv. However, the interaction between variable character acquisition bias and ACRV has never been explored before. We demonstrate that there are two distinct approaches to condition the likelihood on variable characters when there is ACRV, and we call them joint and marginal acquisition bias. Far from being just a trivial mathematical detail, we show that the way in which the variable character conditional likelihood is calculated results in different assumptions about how rate variation is distributed in morphological datasets. Simulations demonstrate that tree length and amount of ACRV in the data are systematically biased when conditioning on variable characters differently from how the data was simulated. Moreover, an empirical case study with extant and extinct taxa reveals a potential impact not only on the estimation of branch lengths, but also of phylogenetic relationships. We recommend the use of the marginal acquisition bias approach for morphological datasets modeled with ACRV. Finally, we urge developers of phylogenetic software to clarify which acquisition bias correction is implemented for both estimation and simulation, and we discuss the implications of our findings on modeling variable characters for the future of morphological phylogenetics.
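The distinction between the two ways of conditioning on variable characters can be made concrete with a toy calculation. The sketch below is not the authors' code, and the abstract does not give the formulas: it assumes a binary Mk model on a three-tip star tree, two equal-weight rate categories, and hypothetical branch lengths and rates. One function divides the rate-averaged pattern probability by the rate-averaged probability of being variable; the other conditions within each rate category and then averages. Which of these the authors call "joint" and which "marginal" is deliberately left open here.

```python
import itertools
import math

# Hypothetical values chosen only for illustration.
BRANCHES = (0.1, 0.5, 1.0)   # star-tree branch lengths, root to each tip
RATES = (0.25, 1.75)         # equal-weight rate categories with mean 1


def p_transition(same, rate, t):
    """Binary Mk transition probability, branch length in expected changes."""
    decay = math.exp(-2.0 * rate * t)
    return 0.5 + 0.5 * decay if same else 0.5 - 0.5 * decay


def pattern_prob(pattern, rate):
    """Unconditional probability of a tip pattern, summing over the root state."""
    total = 0.0
    for root in (0, 1):
        prob = 0.5  # stationary root frequency
        for state, t in zip(pattern, BRANCHES):
            prob *= p_transition(state == root, rate, t)
        total += prob
    return total


PATTERNS = list(itertools.product((0, 1), repeat=len(BRANCHES)))
VARIABLE = [p for p in PATTERNS if len(set(p)) > 1]


def prob_variable(rate):
    """Probability that a character is variable at a given rate."""
    return sum(pattern_prob(p, rate) for p in VARIABLE)


def condition_after_averaging(pattern):
    """Average over rate categories first, then condition on variability."""
    num = sum(pattern_prob(pattern, r) for r in RATES) / len(RATES)
    den = sum(prob_variable(r) for r in RATES) / len(RATES)
    return num / den


def condition_within_category(pattern):
    """Condition on variability inside each rate category, then average."""
    terms = [pattern_prob(pattern, r) / prob_variable(r) for r in RATES]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    for p in VARIABLE:
        print(p, condition_after_averaging(p), condition_within_category(p))
```

Both functions define a proper probability distribution over the variable patterns, yet they assign different probabilities to the same pattern once branch lengths are unequal, which is one way to see why the choice is more than a notational detail.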



Citations: 0

Batesian Mimicry Converges Towards Inaccuracy in Myrmecomorphic Spiders

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-16 DOI: 10.1093/sysbio/syaf037

Michael B J Kelly, Shahan Derkarabetian, Donald James McLean, Ryan Shofner, Cristian J Grismado, Charles R Haddad, Gerasimos Cassis, Gonzalo Giribet, Marie E Herberstein, Jonas O Wolff

Batesian mimicry is an impressive example of convergent evolution driven by predation. However, the observation that many mimics only superficially resemble their models despite strong selective pressures is an apparent paradox. Here, we tested the ‘perfecting hypothesis’, which posits that inaccurate mimicry may represent a transitional stage at the macro-evolutionary scale, by performing the hitherto largest phylogenetic analysis (in terms of the number of taxa and genetic data) of ant-mimicking spiders across two speciose but independent clades, the jumping spider tribe Myrmarachnini (Salticidae) and the sac spider sub-family Castianeirinae (Corinnidae). We found that accurate ant mimicry evolved in a gradual process in both clades, by an integration of compound traits contributing to the ant-like habitus with each trait evolving at different speeds. Accurate states were highly unstable at the macro-evolutionary scale, likely because strong expression of some of these traits comes with high fitness costs. Instead, the inferred global optimum of mimicry expression was at an inaccurate state. This result reverses the onus of explanation from inaccurate mimicry to explaining the exceptional evolution and maintenance of accurate mimicry and highlights that the evolution of Batesian mimicry is ruled by multiple conflicting selective pressures.



Citations: 0

phyddle: software for exploring phylogenetic models with deep learning

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-14 DOI: 10.1093/sysbio/syaf036

Michael J Landis, Ammon Thompson

Phylogenies contain a wealth of information about the evolutionary history and process that gave rise to the diversity of life. This information can be extracted by fitting phylogenetic models to trees. However, many realistic phylogenetic models lack tractable likelihood functions, prohibiting their use with standard inference methods. We present phyddle, pipeline-based software for performing phylogenetic modeling tasks on trees using likelihood-free deep learning approaches. phyddle has a flexible command-line interface, making it easy to integrate deep learning approaches for phylogenetics into research workflows. phyddle coordinates modeling tasks through five pipeline analysis steps (Simulate, Format, Train, Estimate, and Plot) that transform raw phylogenetic datasets as input into numerical and visual model-based output. We conduct three experiments to compare the accuracy of likelihood-based inferences against deep learning-based inferences obtained through phyddle. Benchmarks show that phyddle accurately performs the inference tasks for which it was designed, such as estimating macroevolutionary parameters, selecting among continuous trait evolution models, and passing coverage tests for epidemiological models, even for models that lack tractable likelihoods. Learn more about phyddle at https://phyddle.org.



Citations: 0

Unravelling the Effects of Ecology and Evolutionary History in the Phenotypic Convergence of Fishes.

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-13 DOI: 10.1093/sysbio/syaf034

Jennifer R Hodge,Danielle S Adams,Keiffer L Williams,Laura R V Alencar,Benjamin Camper,Olivier Larouche,Mason A Thurman,Katerina Zapfe,Samantha A Price

Understanding the ecological drivers and limitations of adaptive convergence is a fundamental challenge. Here, we explore how adaptive convergence of planktivorous fishes has been influenced by multiple ecological factors, evolutionary history, and chance. Using ecomorphological data for over 1600 marine species, we integrate pattern-based metrics of convergence with evolutionary model fitting to test whether phenotypic similarities among specialist planktivores exceed expectations under null models and whether ecology, evolutionary history, or their combined effects best explain trait evolution. We find that planktivores are significantly more similar in phenotype than expected. Traits with functional relevance for prey detection and capture, such as eye diameter and lower jaw length, are strongly convergent, while general body size and shape are constrained by deep divisions between clades where the effects of evolutionary history are most pronounced. Since not all traits undergo strong selection toward a convergent ecomorph, their evolutionary trajectories have not entirely overcome ancestral differences in the multivariate trait space, resulting in a specific form of convergence termed conservatism. We show how adaptive responses to feeding ecology intertwine with other ecological pressures (i.e., light environment) and historical contingency to shape fish phenotype evolution over deep time, offering key insights into the generality of phenotypic evolution.



Citations: 0

Phylogenomic and Population Genomic Analyses of Ultraconserved Elements Reveal Deep Coalescence and Introgression Shaped Diversification Patterns in Lamprologine Cichlids of the Congo River.

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-13 DOI: 10.1093/sysbio/syaf032

Fernando Alda,S Elizabeth Alter,Naoko P Kurata,Prosanta Chakrabarty,Melanie L J Stiassny

Understanding the drivers of diversification is a central goal in evolutionary biology but can be challenging when lineages radiate quickly and/or hybridize frequently. Cichlids in the tribe Lamprologini, an exceptionally diverse clade found in the Congo basin, exemplify these issues: their evolutionary history has been difficult to untangle with previous datasets, particularly with regard to river-dwelling lineages in the genus Lamprologus. This clade notably includes the only known blind and depigmented cichlid, L. lethops. Here, we reconstructed the evolutionary, population, and biogeographic history of a Lamprologus clade from the Congo River by leveraging genomic data and sampling over 50 lamprologine species from the entire Lake Tanganyika radiation. This study provides the most comprehensive species-level coverage to date of the riverine taxa within this lacustrine-origin clade. We found that in the mid-late Pliocene, two lineages of Lake Tanganyika lamprologines independently colonized the Congo River, where they subsequently hybridized and diversified, forming the current monophyletic group of riverine Lamprologus. Our estimates for divergence time and introgression align with the region's geological history and suggest rapid speciation in Lamprologus species from the Congo River marked by rapids-driven vicariance and water level fluctuations, and repeated episodes of secondary contact and reticulation. As a result of our analyses, we propose the taxonomic restriction of the genus Lamprologus to Congo River taxa only. The complex evolutionary history of this group, characterized by introgressive hybridization followed by a rapid series of isolation and reconnection, illustrates the multifaceted dynamics of speciation that have shaped the rich biodiversity of this region. [African cichlids; Congo River; diversification; hybridization; Lamprologini; phylogenomics; UCEs; ultraconserved elements].



Citations: 0

CAnDI: a new tool to investigate conflict in homologous gene trees and explain convergent trait evolution

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-08 DOI: 10.1093/sysbio/syaf028

Holly M Robertson, Joseph F Walker, Edwige Moyroud

Phenotypic convergence is found across the tree of life, and morphological similarities in distantly related species are often presumed to have evolved independently. However, clarifying the origins of traits has recently highlighted the complex nature of evolution, as apparent convergent features often share similar genetic foundations. Hence, the tree topology of genes that underlie such traits frequently conflicts with the overall history of species relationships. This conflict, which usually results from incomplete lineage sorting, introgression or horizontal gene transfer, creates both a challenge for systematists and an exciting opportunity to investigate the rich, complex network of information that connects molecular trajectories with trait evolution. Here we present a novel conflict identification program named CAnDI (Conflict And Duplication Identifier), which enables the analysis of conflict in homologous gene trees rather than inferred orthologs. We demonstrate that the analysis of conflicts in homologous trees using CAnDI yields more comparisons than in ortholog trees in six datasets from across the eukaryotic tree of life. Using the carnivorous trap of Caryophyllales, a charismatic group of flowering plants, as a case study, we demonstrate that analysing conflict on entire homolog trees can aid in inferring the contribution of standing genetic variation to trait evolution: by dissecting all gene relationships within homolog trees, we find genomic evidence that the molecular basis of the plesiomorphic mucilaginous sticky trap was likely present in the ancestor of all carnivorous Caryophyllales. We also show that many genes whose evolutionary trajectories group species with similar trap devices code for proteins contributing to plant carnivory and identify a LATERAL ORGAN BOUNDARY DOMAIN transcription factor as a possible candidate for regulating sticky trap development.
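To make "conflict" between a gene tree and a species tree concrete, the following minimal sketch flags gene-tree clades that overlap a species-tree clade without either containing the other. It is not CAnDI's implementation: the nested-tuple tree encoding, the function names and the example trees are all hypothetical, and the sketch assumes every tip label occurs once, so it does not handle the duplicated genes in homolog trees that CAnDI is described as addressing.

```python
def clades(tree):
    """Return (leaves under this node, leaf sets of every internal node below it)."""
    if isinstance(tree, str):  # a tip is represented by its label
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)  # post-order: children before their parent
    return leaves, found


def in_conflict(clade_a, clade_b, shared):
    """Two rooted clades conflict if they overlap but neither nests in the other."""
    a, b = clade_a & shared, clade_b & shared
    return bool(a & b) and not (a <= b or b <= a)


def conflicting_clades(gene_tree, species_tree):
    """List gene-tree clades that conflict with at least one species-tree clade."""
    gene_leaves, gene_clades = clades(gene_tree)
    species_leaves, species_clades = clades(species_tree)
    shared = gene_leaves & species_leaves  # compare only on taxa present in both
    return [g for g in gene_clades
            if any(in_conflict(g, s, shared) for s in species_clades)]


# Hypothetical four-taxon example.
SPECIES_TREE = (("A", "B"), ("C", "D"))
GENE_TREE = (("A", "C"), ("B", "D"))

if __name__ == "__main__":
    for clade in conflicting_clades(GENE_TREE, SPECIES_TREE):
        print(sorted(clade))
```

In this example the gene-tree groupings A+C and B+D each cut across the species-tree groupings A+B and C+D, so both are reported; a gene tree identical to the species tree yields an empty list.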



Citations: 0

Introgression across narrow contact zones shapes the genomic landscape of phylogenetic variation in an African bird clade

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-07 DOI: 10.1093/sysbio/syaf033

Loïs Rancilhac, Stacey G de Souza, Sifiso M Lukhele, Matteo Sebastianelli, Bridget O Ogolowa, Michaella Moysi, Christos Nikiforou, Tsyon Asfaw, Colleen T Downs, Alan Brelsford, Bridgett M vonHoldt, Alexander N G Kirschel

Genomic analyses of hybrid zones provide excellent opportunities to investigate the consequences of introgression in nature. In combination with phylogenomics analyses, hybrid zone studies may illuminate the role of ancient and contemporary gene flow in shaping variation of phylogenetic signals across the genome, but this avenue has not been explored yet. We combined phylogenomic and geographic cline analyses in a Pogoniulus tinkerbird clade to determine whether contemporary introgression through hybrid zones contributes to gene-tree heterogeneity across the species ranges. We found diverse phylogenetic signals across the genome with the most common topologies supporting monophyly among taxa connected by secondary contact zones. Remarkably, these systematic conflicts were also recovered when selecting only individuals from each taxon's core range. Using analyses of derived allele sharing and “recombination aware” phylogenomics, we found that introgression shapes gene-tree heterogeneity, and the species tree most likely supports monophyletic red-fronted tinkerbirds, as recovered in previous reconstructions based on mitochondrial DNA. Furthermore, by fitting geographic clines across two secondary contact zones, we found that introgression rates were lower in genomic regions supporting the putative species tree compared to those supporting the two taxa in contact as monophyletic. This demonstrates that introgression through narrow contact zones shapes gene-tree heterogeneity even in allopatric populations. Finally, we did not find evidence that mitochondria-interacting nuclear genes acted as barrier loci. Our results show that species can withstand important amounts of introgression while maintaining their phenotypic integrity and ecological separation, raising questions regarding the genomic architecture of adaptation and barriers to gene flow.
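The abstract does not say which allele-sharing statistic was used. One widely used measure of derived allele sharing is Patterson's D (the ABBA-BABA test), sketched below under the assumption of biallelic sites coded 0 (ancestral) or 1 (derived) for three ingroup taxa and an outgroup. The function name `patterson_d` and the toy site patterns are hypothetical and are not taken from the study.

```python
def patterson_d(sites):
    """Compute Patterson's D from (P1, P2, P3, outgroup) allele tuples.

    Alleles are coded 0 (ancestral) or 1 (derived); the outgroup is assumed
    to carry the ancestral allele. D = (ABBA - BABA) / (ABBA + BABA).
    """
    abba = sum(1 for s in sites if s == (0, 1, 1, 0))  # P2 and P3 share derived
    baba = sum(1 for s in sites if s == (1, 0, 1, 0))  # P1 and P3 share derived
    if abba + baba == 0:
        return 0.0  # no informative sites
    return (abba - baba) / (abba + baba)


# Toy data (invented): 3 ABBA sites, 1 BABA site, 1 uninformative site.
toy_sites = [(0, 1, 1, 0)] * 3 + [(1, 0, 1, 0)] + [(1, 1, 0, 0)]
d_value = patterson_d(toy_sites)
```

A positive D indicates an excess of derived alleles shared between P2 and P3, which is the pattern expected if those two taxa exchanged genes. Assessing significance would need a resampling step such as a block jackknife, which is omitted here.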



Citations: 0

Estimating Genome-wide Phylogenies Using Probabilistic Topic Modeling

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-05 DOI: 10.1093/sysbio/syaf015

Marzieh Khodaei, Scott V Edwards, Peter Beerli

Methods for rapidly inferring the evolutionary history of species or populations with genome-wide data are progressing, but computational constraints still limit our abilities in this area. We developed an alignment-free method to infer genome-wide phylogenies and implemented it in the Python package TopicContml. The method uses probabilistic topic modeling (specifically, Latent Dirichlet Allocation or LDA) to extract ‘topic’ frequencies from k-mers, which are derived from multilocus DNA sequences. These extracted frequencies then serve as an input for the program Contml in the PHYLIP package, which is used to generate a species tree. We evaluated the performance of TopicContml on simulated datasets with gaps and three biological datasets: (1) 14 DNA sequence loci from two Australian bird species distributed across nine populations, (2) 5162 loci from 80 mammal species, and (3) raw, unaligned, non-orthologous PacBio sequences from 12 bird species. We also assessed the uncertainty of the estimated relationships among clades using a bootstrap procedure. Our empirical results and simulated data suggest that our method is efficient and statistically robust.
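The first step of such a pipeline, turning unaligned sequences into k-mer "documents", can be sketched with the Python standard library alone. This is an illustration, not TopicContml's code: the function name `kmer_counts`, the choice k = 4, and the toy sequences are all assumptions, and the LDA and Contml steps are not shown.

```python
from collections import Counter


def kmer_counts(sequences, k=4):
    """Count overlapping k-mers across all sequences of one population.

    k-mers containing anything other than A, C, G or T (gaps, N) are skipped,
    so no alignment is needed.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


# Toy input (invented): one locus, two populations, unaligned sequences.
populations = {
    "pop_A": ["ACGTACGT", "ACGTAC-T"],
    "pop_B": ["TTGACGTA"],
}
documents = {name: kmer_counts(seqs) for name, seqs in populations.items()}
```

Each population's counts would form one row of a document-term matrix. A topic model fitted to that matrix would yield per-population topic frequencies, which are the kind of continuous characters Contml takes as input.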



Citations: 0

PickMe: Sample selection for species tree reconstruction using coalescent weighted quartets

IF 6.5 Zone 1 (Biology) Q1 EVOLUTIONARY BIOLOGY

Systematic Biology

Pub Date : 2025-05-05 DOI: 10.1093/sysbio/syaf017

Joseph Rusinko, Yu Cai, Allison Crysler, Katherine Thompson, Julien Boutte, Mark Fishbein, Shannon C K Straub

After collecting large data sets for phylogenomics studies, researchers must decide which genes or samples to include when reconstructing a species tree. Incomplete or unreliable data sets make the empiricist’s decision more difficult. Researchers rely on ad hoc strategies to maximize sampling while ensuring sufficient data for accurate inferences. An algorithm called PickMe formalizes the sample selection process, assuming that the samples evolved under the Tree Multispecies Coalescent model. We propose a Bayesian framework for selecting samples for species tree analysis. Given a collection of gene trees, we compute a posterior probability for each quartet, describing the likelihood that the species tree displays this topology. From this, we assign individual samples reliability scores computed as the average of a scaled version of the posterior probabilities. PickMe uses these weights to recommend which samples to include in a species tree analysis. Analysis of simulated data showed that including the samples suggested by PickMe produced species trees closer to the true species trees than both unfiltered data sets and data sets with ad hoc gene occupancy cut-offs applied. To further illustrate the efficacy of this tool, we apply PickMe to gene trees generated from target capture data from milkweeds. PickMe indicates more samples could have reliably been included in a previous milkweed phylogenomic analysis than the authors analyzed without access to a formal methodology for sample selection. Using simulated and empirical data, we also compare PickMe to existing sample selection methods. Inclusion of PickMe will enhance phylogenomics data analysis pipelines by providing a formal structure for sample selection.
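The scoring idea, a per-sample reliability computed as the average of scaled quartet posteriors, can be sketched as follows. The abstract does not give the scaling or the inclusion rule, so both are assumptions here: the linear rescale maps 1/3 (no preference among the three possible quartet topologies) to 0 and 1 to 1, and the 0.3 cut-off is hypothetical. The function name and toy posteriors are also invented for illustration.

```python
from collections import defaultdict


def sample_reliability(quartet_posteriors):
    """Average a scaled quartet posterior over every quartet containing a sample.

    `quartet_posteriors` maps a 4-tuple of sample names to the posterior
    probability of its best-supported topology. The scaling (1/3 -> 0,
    1 -> 1) is an assumption of this sketch, not PickMe's definition.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for quartet, posterior in quartet_posteriors.items():
        scaled = max(0.0, (3 * posterior - 1) / 2)
        for sample in quartet:
            totals[sample] += scaled
            counts[sample] += 1
    return {sample: totals[sample] / counts[sample] for sample in totals}


# Toy posteriors (invented): quartets containing sample "e" are poorly resolved.
posteriors = {
    ("a", "b", "c", "d"): 1.0,
    ("a", "b", "c", "e"): 0.5,
    ("a", "b", "d", "e"): 0.5,
}
scores = sample_reliability(posteriors)
keep = sorted(s for s, score in scores.items() if score >= 0.3)  # hypothetical cut-off
```

In this toy case sample "e" appears only in weakly supported quartets, so it receives the lowest score and is the one sample excluded.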



Citations: 0