Brooke Bodensteiner, Edward D Burress, Martha M Muñoz
Adaptive radiation involves diversification along multiple trait axes, producing phenotypically diverse, species-rich lineages. Theory generally predicts that multi-trait evolution occurs via a "stages" model, with some traits saturating early in a lineage's history, and others diversifying later. Despite its multidimensional nature, however, we know surprisingly little about how different suites of traits evolve during adaptive radiation. Here, we investigated the rate, pattern, and timing of morphological and physiological evolution in the anole lizard adaptive radiation from the Caribbean island of Hispaniola. Rates and patterns of morphological and physiological diversity are largely unaligned, corresponding to independent selective pressures associated with structural and thermal niches. Cold tolerance evolution reflects parapatric divergence across elevation, rather than niche partitioning within communities. Heat tolerance evolution and the preferred temperature evolve more slowly than cold tolerance, reflecting behavioral buffering, particularly in edge-habitat species (a pattern associated with the Bogert effect). In contrast to the nearby island of Puerto Rico, closely related anoles on Hispaniola do not sympatrically partition thermal niche space. Instead, allopatric and parapatric separation across biogeographic and environmental boundaries serves to keep morphologically similar close relatives apart. The phenotypic diversity of this island's adaptive radiation accumulated largely as a by-product of time, with surprisingly few exceptional pulses of trait evolution. A better understanding of the processes that guide multidimensional trait evolution (and nuance therein) will prove key in determining whether the stages model should be considered a common theme of adaptive radiation.
{"title":"Adaptive Radiation Without Independent Stages of Trait Evolution in a Group of Caribbean Anoles.","authors":"Brooke Bodensteiner, Edward D Burress, Martha M Muñoz","doi":"10.1093/sysbio/syae041","DOIUrl":"10.1093/sysbio/syae041","url":null,"abstract":"<p><p>Adaptive radiation involves diversification along multiple trait axes, producing phenotypically diverse, species-rich lineages. Theory generally predicts that multi-trait evolution occurs via a \"stages\" model, with some traits saturating early in a lineage's history, and others diversifying later. Despite its multidimensional nature, however, we know surprisingly little about how different suites of traits evolve during adaptive radiation. Here, we investigated the rate, pattern, and timing of morphological and physiological evolution in the anole lizard adaptive radiation from the Caribbean island of Hispaniola. Rates and patterns of morphological and physiological diversity are largely unaligned, corresponding to independent selective pressures associated with structural and thermal niches. Cold tolerance evolution reflects parapatric divergence across elevation, rather than niche partitioning within communities. Heat tolerance evolution and the preferred temperature evolve more slowly than cold tolerance, reflecting behavioral buffering, particularly in edge-habitat species (a pattern associated with the Bogert effect). In contrast to the nearby island of Puerto Rico, closely related anoles on Hispaniola do not sympatrically partition thermal niche space. Instead, allopatric and parapatric separation across biogeographic and environmental boundaries serves to keep morphologically similar close relatives apart. The phenotypic diversity of this island's adaptive radiation accumulated largely as a by-product of time, with surprisingly few exceptional pulses of trait evolution. 
A better understanding of the processes that guide multidimensional trait evolution (and nuance therein) will prove key in determining whether the stages model should be considered a common theme of adaptive radiation.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"743-757"},"PeriodicalIF":6.1,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141879499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R Alexander Pyron, Kyle A O'Connell, Edward A Myers, David A Beamer, Hector Baños
Reticulation between incipient lineages is a common feature of diversification. We examine these phenomena in the Pisgah clade of Desmognathus salamanders from the southern Appalachian Mountains of the eastern United States. The group contains four to seven species exhibiting two discrete phenotypes, aquatic "shovel-nosed" and semi-aquatic "black-bellied" forms. These ecomorphologies are ancient and have apparently been transmitted repeatedly between lineages through introgression. Geographically proximate populations of both phenotypes exhibit admixture, and at least two black-bellied lineages have been produced via reticulations between shovel-nosed parentals, suggesting potential hybrid speciation dynamics. However, computational constraints currently limit our ability to reconstruct network radiations from gene-tree data. Available methods are limited to level-1 networks wherein reticulations do not share edges, and higher-level networks may be non-identifiable in many cases. We present a heuristic approach to recover information from higher-level networks across a range of potentially identifiable empirical scenarios, supported by theory and simulation. When extrinsic information indicates the location and direction of reticulations, our method can successfully estimate a reduced possible set of non-level-1 networks. Phylogenomic data support a single backbone topology with up to five overlapping hybrid edges in the Pisgah clade. These results suggest an unusual mechanism of ecomorphological hybrid speciation, wherein a binary threshold trait causes some hybrid populations to shift between microhabitat niches, promoting ecological divergence between sympatric hybrids and parentals. This contrasts with other well-known systems in which hybrids exhibit intermediate, novel, or transgressive phenotypes. 
The genetic basis of these phenotypes is unclear and further data are needed to clarify the evolutionary basis of morphological changes with ecological consequences.
{"title":"Complex Hybridization in a Clade of Polytypic Salamanders (Plethodontidae: Desmognathus) Uncovered by Estimating Higher-Level Phylogenetic Networks.","authors":"R Alexander Pyron, Kyle A O'Connell, Edward A Myers, David A Beamer, Hector Baños","doi":"10.1093/sysbio/syae060","DOIUrl":"https://doi.org/10.1093/sysbio/syae060","url":null,"abstract":"<p><p>Reticulation between incipient lineages is a common feature of diversification. We examine these phenomena in the Pisgah clade of Desmognathus salamanders from the southern Appalachian Mountains of the eastern United States. The group contains four to seven species exhibiting two discrete phenotypes, aquatic \"shovel-nosed\" and semi-aquatic \"black-bellied\" forms. These ecomorphologies are ancient and have apparently been transmitted repeatedly between lineages through introgression. Geographically proximate populations of both phenotypes exhibit admixture, and at least two black-bellied lineages have been produced via reticulations between shovel-nosed parentals, suggesting potential hybrid speciation dynamics. However, computational constraints currently limit our ability to reconstruct network radiations from gene-tree data. Available methods are limited to level-1 networks wherein reticulations do not share edges, and higher-level networks may be non-identifiable in many cases. We present a heuristic approach to recover information from higher-level networks across a range of potentially identifiable empirical scenarios, supported by theory and simulation. When extrinsic information indicates the location and direction of reticulations, our method can successfully estimate a reduced possible set of non-level-1 networks. Phylogenomic data support a single backbone topology with up to five overlapping hybrid edges in the Pisgah clade. 
These results suggest an unusual mechanism of ecomorphological hybrid speciation, wherein a binary threshold trait causes some hybrid populations to shift between microhabitat niches, promoting ecological divergence between sympatric hybrids and parentals. This contrasts with other well-known systems in which hybrids exhibit intermediate, novel, or transgressive phenotypes. The genetic basis of these phenotypes is unclear and further data are needed to clarify the evolutionary basis of morphological changes with ecological consequences.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142523144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carlos A Maya-Lastra, Patrick W Sweeney, Deren A R Eaton, Vania Torrez, Carla Maldonado, Malu I Ore-Rengifo, Mónica Arakaki, Michael J Donoghue, Erika J Edwards
A fundamental objective of evolutionary biology is to understand the origin of independently evolving species. Phylogenetic studies of species radiations rarely are able to document ongoing speciation; instead, modes of speciation, entailing geographic separation and/or ecological differentiation, are posited retrospectively. The Oreinotinus clade of Viburnum has radiated recently from north to south through the cloud forests of Mexico and Central America to the Central Andes. Our analyses support a hypothesis of incipient speciation in Oreinotinus at the southern edge of its geographic range, from central Peru to northern Argentina. Although several species and infraspecific taxa have been recognized in this area, multiple lines of evidence and analytical approaches (including analyses of phylogenetic relationships, genetic structure, leaf morphology, and climatic envelopes) favor the recognition of just a single species, V. seemenii. We show that what has previously been recognized as V. seemenii f. minor has recently occupied the drier Tucuman-Bolivian forest region from Samaipata in Bolivia to Salta in northern Argentina. Plants in these populations form a well-supported clade with a distinctive genetic signature and they have evolved smaller, narrower leaves. We interpret this as the beginning of a within-species divergence process that has elsewhere in the neotropics resulted repeatedly in Viburnum species with a particular set of leaf ecomorphs. Specifically, the southern populations are in the process of evolving the small, glabrous, and entire leaf ecomorph that has evolved in four other montane areas of endemism. As predicted based on our studies of leaf ecomorphs in Chiapas, Mexico, these southern populations experience generally drier conditions, with large diurnal temperature fluctuations. In a central portion of the range of V. seemenii, characterized by wetter climatic conditions, we also document what may be the initial differentiation of the leaf ecomorph with larger, pubescent, and toothy leaves. The emergence of these ecomorphs thus appears to be driven by adaptation to subtly different climatic conditions in separate geographic regions, as opposed to parapatric differentiation along elevational gradients as suggested by Viburnum species distributions in other parts of the neotropics.
{"title":"Caught in the Act: Incipient Speciation at the Southern Limit of Viburnum in the Central Andes.","authors":"Carlos A Maya-Lastra, Patrick W Sweeney, Deren A R Eaton, Vania Torrez, Carla Maldonado, Malu I Ore-Rengifo, Mónica Arakaki, Michael J Donoghue, Erika J Edwards","doi":"10.1093/sysbio/syae023","DOIUrl":"10.1093/sysbio/syae023","url":null,"abstract":"<p><p>A fundamental objective of evolutionary biology is to understand the origin of independently evolving species. Phylogenetic studies of species radiations rarely are able to document ongoing speciation; instead, modes of speciation, entailing geographic separation and/or ecological differentiation, are posited retrospectively. The Oreinotinus clade of Viburnum has radiated recently from north to south through the cloud forests of Mexico and Central America to the Central Andes. Our analyses support a hypothesis of incipient speciation in Oreinotinus at the southern edge of its geographic range, from central Peru to northern Argentina. Although several species and infraspecific taxa have been recognized in this area, multiple lines of evidence and analytical approaches (including analyses of phylogenetic relationships, genetic structure, leaf morphology, and climatic envelopes) favor the recognition of just a single species, V. seemenii. We show that what has previously been recognized as V. seemenii f. minor has recently occupied the drier Tucuman-Bolivian forest region from Samaipata in Bolivia to Salta in northern Argentina. Plants in these populations form a well-supported clade with a distinctive genetic signature and they have evolved smaller, narrower leaves. We interpret this as the beginning of a within-species divergence process that has elsewhere in the neotropics resulted repeatedly in Viburnum species with a particular set of leaf ecomorphs. 
Specifically, the southern populations are in the process of evolving the small, glabrous, and entire leaf ecomorph that has evolved in four other montane areas of endemism. As predicted based on our studies of leaf ecomorphs in Chiapas, Mexico, these southern populations experience generally drier conditions, with large diurnal temperature fluctuations. In a central portion of the range of V. seemenii, characterized by wetter climatic conditions, we also document what may be the initial differentiation of the leaf ecomorph with larger, pubescent, and toothy leaves. The emergence of these ecomorphs thus appears to be driven by adaptation to subtly different climatic conditions in separate geographic regions, as opposed to parapatric differentiation along elevational gradients as suggested by Viburnum species distributions in other parts of the neotropics.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"629-643"},"PeriodicalIF":6.1,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141238062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Giorgio Bianchini, Martin Hagemann, Patricia Sánchez-Baracaldo
Cyanobacteria are the only prokaryotes to have evolved oxygenic photosynthesis, paving the way for complex life. Studying the evolution and ecological niche of cyanobacteria and their ancestors is crucial for understanding the intricate dynamics of biosphere evolution. These organisms frequently deal with environmental stressors such as salinity and drought, and they employ compatible solutes as a mechanism to cope with these challenges. Compatible solutes are small molecules that help maintain cellular osmotic balance in high-salinity environments, such as marine waters. Their production plays a crucial role in salt tolerance, which, in turn, influences habitat preference. Among the 5 known compatible solutes produced by cyanobacteria (sucrose, trehalose, glucosylglycerol, glucosylglycerate, and glycine betaine), their synthesis varies between individual strains. In this study, we work in a Bayesian stochastic mapping framework, integrating multiple sources of information about compatible solute biosynthesis in order to predict the ancestral habitat preference of Cyanobacteria. Through extensive model selection analyses and statistical tests for correlation, we identify glucosylglycerol and glucosylglycerate as the most significantly correlated with habitat preference, while trehalose exhibits the weakest correlation. Additionally, glucosylglycerol, glucosylglycerate, and glycine betaine show high loss/gain rate ratios, indicating their potential role in adaptability, while sucrose and trehalose are less likely to be lost due to their additional cellular functions. Contrary to previous findings, our analyses predict that the last common ancestor of Cyanobacteria (living at around 3180 Ma) had a 97% probability of a high-salinity habitat preference and was likely able to synthesize glucosylglycerol and glucosylglycerate. 
Nevertheless, cyanobacteria likely colonized low-salinity environments shortly after their origin, with an 89% probability of the first cyanobacterium with low-salinity habitat preference arising prior to the Great Oxygenation Event (2460 Ma). Stochastic mapping analyses provide evidence of cyanobacteria inhabiting early marine habitats, aiding in the interpretation of the geological record. Our age estimate of ~2590 Ma for the divergence of 2 major cyanobacterial clades (Macro- and Microcyanobacteria) suggests that these were likely significant contributors to primary productivity in marine habitats in the lead-up to the Great Oxygenation Event, and thus played a pivotal role in triggering the sudden increase in atmospheric oxygen.
{"title":"Stochastic Character Mapping, Bayesian Model Selection, and Biosynthetic Pathways Shed New Light on the Evolution of Habitat Preference in Cyanobacteria.","authors":"Giorgio Bianchini, Martin Hagemann, Patricia Sánchez-Baracaldo","doi":"10.1093/sysbio/syae025","DOIUrl":"10.1093/sysbio/syae025","url":null,"abstract":"<p><p>Cyanobacteria are the only prokaryotes to have evolved oxygenic photosynthesis paving the way for complex life. Studying the evolution and ecological niche of cyanobacteria and their ancestors is crucial for understanding the intricate dynamics of biosphere evolution. These organisms frequently deal with environmental stressors such as salinity and drought, and they employ compatible solutes as a mechanism to cope with these challenges. Compatible solutes are small molecules that help maintain cellular osmotic balance in high-salinity environments, such as marine waters. Their production plays a crucial role in salt tolerance, which, in turn, influences habitat preference. Among the 5 known compatible solutes produced by cyanobacteria (sucrose, trehalose, glucosylglycerol, glucosylglycerate, and glycine betaine), their synthesis varies between individual strains. In this study, we work in a Bayesian stochastic mapping framework, integrating multiple sources of information about compatible solute biosynthesis in order to predict the ancestral habitat preference of Cyanobacteria. Through extensive model selection analyses and statistical tests for correlation, we identify glucosylglycerol and glucosylglycerate as the most significantly correlated with habitat preference, while trehalose exhibits the weakest correlation. Additionally, glucosylglycerol, glucosylglycerate, and glycine betaine show high loss/gain rate ratios, indicating their potential role in adaptability, while sucrose and trehalose are less likely to be lost due to their additional cellular functions. 
Contrary to previous findings, our analyses predict that the last common ancestor of Cyanobacteria (living at around 3180 Ma) had a 97% probability of a high salinity habitat preference and was likely able to synthesize glucosylglycerol and glucosylglycerate. Nevertheless, cyanobacteria likely colonized low-salinity environments shortly after their origin, with an 89% probability of the first cyanobacterium with low-salinity habitat preference arising prior to the Great Oxygenation Event (2460 Ma). Stochastic mapping analyses provide evidence of cyanobacteria inhabiting early marine habitats, aiding in the interpretation of the geological record. Our age estimate of ~2590 Ma for the divergence of 2 major cyanobacterial clades (Macro- and Microcyanobacteria) suggests that these were likely significant contributors to primary productivity in marine habitats in the lead-up to the Great Oxygenation Event, and thus played a pivotal role in triggering the sudden increase in atmospheric oxygen.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"644-665"},"PeriodicalIF":6.1,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11505929/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141459410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
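The loss/gain rate ratios mentioned in the cyanobacteria abstract above can be illustrated with a two-state (absent/present) continuous-time Markov model, the kind of model that stochastic character mapping is typically built on. The following is a minimal sketch under stated assumptions, not the authors' model: the function names are hypothetical and the rates are made-up values, not estimates from the study. It shows how a high loss/gain ratio lowers the long-run probability that a trait (here, synthesis of a compatible solute) is present.

```python
import math


def transition_probabilities(gain, loss, t):
    """Return the 2x2 matrix P(t) for a two-state Markov model.

    State 0 = trait absent, state 1 = trait present.
    gain = rate of 0 -> 1, loss = rate of 1 -> 0 (both assumed > 0).
    Uses the standard closed form for a two-state chain.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    p_gain = gain / total * (1.0 - decay)  # P(absent -> present in time t)
    p_loss = loss / total * (1.0 - decay)  # P(present -> absent in time t)
    return [[1.0 - p_gain, p_gain],
            [p_loss, 1.0 - p_loss]]


def stationary_presence(gain, loss):
    """Long-run probability that the trait is present."""
    return gain / (gain + loss)


# Illustrative (made-up) rates: losses three times as frequent as gains.
example_gain, example_loss = 1.0, 3.0
example_ratio = example_loss / example_gain
example_matrix = transition_probabilities(example_gain, example_loss, 0.5)
```

With a loss/gain ratio of 3, the stationary probability of presence is 1 / (1 + 3) = 0.25, so under this toy model a frequently lost trait is expected to be absent in most lineages over long time spans.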
Lena Collienne, Mary Barker, Marc A Suchard, Frederick A Matsen IV
Online phylogenetic inference methods add sequentially arriving sequences to an inferred phylogeny without the need to recompute the entire tree from scratch. Some online method implementations exist already, but there remains concern that additional sequences may change the topological relationship among the original set of taxa. We call such a change in tree topology a lack of stability for the inferred tree. In this paper, we analyze the stability of single taxon addition in a Maximum Likelihood framework across 1,000 empirical datasets. We find that instability occurs in almost 90% of our examples, although observed topological differences do not always reach significance under the AU-test. Changes in tree topology after addition of a taxon rarely occur close to its attachment location, and are more frequently observed in more distant tree locations carrying low bootstrap support. To investigate whether instability is predictable, we hypothesize sources of instability and design summary statistics addressing these hypotheses. Using these summary statistics as input features for machine learning under random forests, we are able to predict instability and can identify the most influential features. In summary, it does not appear that a strict insertion-only online inference method will deliver globally optimal trees, although relaxing insertion strictness by allowing for a small number of final tree rearrangements or accepting slightly suboptimal solutions appears feasible.
{"title":"Phylogenetic tree instability after taxon addition: empirical frequency, predictability, and consequences for online inference","authors":"Lena Collienne, Mary Barker, Marc A Suchard, Frederick A Matsen IV","doi":"10.1093/sysbio/syae059","DOIUrl":"https://doi.org/10.1093/sysbio/syae059","url":null,"abstract":"Online phylogenetic inference methods add sequentially arriving sequences to an inferred phylogeny without the need to recompute the entire tree from scratch. Some online method implementations exist already, but there remains concern that additional sequences may change the topological relationship among the original set of taxa. We call such a change in tree topology a lack of stability for the inferred tree. In this paper, we analyze the stability of single taxon addition in a Maximum Likelihood framework across 1,000 empirical datasets. We find that instability occurs in almost 90% of our examples, although observed topological differences do not always reach significance under the AU-test. Changes in tree topology after addition of a taxon rarely occur close to its attachment location, and are more frequently observed in more distant tree locations carrying low bootstrap support. To investigate whether instability is predictable, we hypothesize sources of instability and design summary statistics addressing these hypotheses. Using these summary statistics as input features for machine learning under random forests, we are able to predict instability and can identify the most influential features. 
In summary, it does not appear that a strict insertion-only online inference method will deliver globally optimal trees, although relaxing insertion strictness by allowing for a small number of final tree rearrangements or accepting slightly suboptimal solutions appears feasible.","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":"31 1","pages":""},"PeriodicalIF":6.5,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142490398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
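The notion of stability used in the taxon-addition abstract above (whether adding one taxon changes the topology among the original taxa) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: trees are assumed to be given as nested Python tuples of taxon names rather than inferred by Maximum Likelihood, and all function names are hypothetical. The idea is to prune the added taxon from the new tree and compare the sets of nontrivial bipartitions (splits) of the two unrooted trees.

```python
def leaves(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def prune(tree, taxon):
    """Remove one taxon and collapse any node left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kept = [p for p in (prune(child, taxon) for child in tree) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


def splits(tree):
    """Return the nontrivial bipartitions of the tree, read as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference taxon, so the result is independent of rooting.
    """
    all_taxa = leaves(tree)
    reference = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - clade if reference in clade else clade
        if 2 <= len(side) <= len(all_taxa) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def is_stable(old_tree, new_tree, added_taxon):
    """True if the original taxa keep the same topology after the addition."""
    return splits(old_tree) == splits(prune(new_tree, added_taxon))


# Hypothetical five-taxon example; "F" is the newly added taxon.
old = (("A", "B"), ("C", "D"), "E")
new_same = (("A", "B"), (("C", "F"), "D"), "E")
new_changed = (("A", "C"), (("B", "F"), "D"), "E")
```

In this toy example, `is_stable(old, new_same, "F")` holds because pruning F recovers the original splits, whereas `new_changed` regroups A with C and is therefore unstable. A real analysis would also need a significance test such as the AU-test mentioned in the abstract, which this sketch does not attempt.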
Samuel Abalde, Ulf Jondelius
Xenacoelomorpha are mostly microscopic, morphologically simple worms, lacking many structures typical of other bilaterians. Xenacoelomorphs (which include three main groups: Acoela, Nemertodermatida, and Xenoturbella) have been proposed to be an early diverging Bilateria, sister to protostomes and deuterostomes, but other phylogenomic analyses have recovered this clade nested within the deuterostomes, as sister to Ambulacraria. The position of Xenacoelomorpha within the metazoan tree has understandably attracted a lot of attention, overshadowing the study of phylogenetic relationships within this group. Given that Xenoturbella includes only six species whose relationships are well understood, we decided to focus on the most speciose Acoelomorpha (Acoela + Nemertodermatida). Here, we have sequenced 29 transcriptomes, doubling the number of sequenced species, to infer a backbone tree for Acoelomorpha based on genomic data. The recovered topology is mostly congruent with previous studies. The most important difference is the recovery of Paratomella as the first off-shoot within Acoela, dramatically changing the reconstruction of the ancestral acoel. In addition, we have detected incongruence between the gene trees and the species tree, likely linked to incomplete lineage sorting, and some signal of introgression between the families Dakuidae and Mecynostomidae, which hampers inferring the correct placement of this family and, particularly, of the genus Notocelis. We have also used this dataset to infer for the first time diversification times within Acoelomorpha, which coincide with known bilaterian diversification and extinction events. Given the importance of morphological data in acoelomorph phylogenetics, we tested several partitions and models. Although morphological data failed to recover a robust phylogeny, phylogenetic placement has proven to be a suitable alternative when a reference phylogeny is available.
{"title":"A Phylogenomic Backbone for Acoelomorpha Inferred from Transcriptomic Data","authors":"Samuel Abalde, Ulf Jondelius","doi":"10.1093/sysbio/syae057","DOIUrl":"https://doi.org/10.1093/sysbio/syae057","url":null,"abstract":"Xenacoelomorpha are mostly microscopic, morphologically simple worms, lacking many structures typical of other bilaterians. Xenacoelomorphs –which include three main groups: Acoela, Nemertodermatida, and Xenoturbella– have been proposed to be an early diverging Bilateria, sister to protostomes and deuterostomes, but other phylogenomic analyses have recovered this clade nested within the deuterostomes, as sister to Ambulacraria. The position of Xenacoelomorpha within the metazoan tree has understandably attracted a lot of attention, overshadowing the study of phylogenetic relationships within this group. Given that Xenoturbella includes only six species whose relationships are well understood, we decided to focus on the most speciose Acoelomorpha (Acoela + Nemertodermatida). Here, we have sequenced 29 transcriptomes, doubling the number of sequenced species, to infer a backbone tree for Acoelomorpha based on genomic data. The recovered topology is mostly congruent with previous studies. The most important difference is the recovery of Paratomella as the first off-shoot within Acoela, dramatically changing the reconstruction of the ancestral acoel. Besides, we have detected incongruence between the gene trees and the species tree, likely linked to incomplete lineage sorting, and some signal of introgression between the families Dakuidae and Mecynostomidae, which hampers inferring the correct placement of this family and, particularly, of the genus Notocelis. We have also used this dataset to infer for the first time diversification times within Acoelomorpha, which coincide with known bilaterian diversification and extinction events. 
Given the importance of morphological data in acoelomorph phylogenetics, we tested several partitions and models. Although morphological data failed to recover a robust phylogeny, phylogenetic placement has proven to be a suitable alternative when a reference phylogeny is available.","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":"97 1","pages":""},"PeriodicalIF":6.5,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142489582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
George P Tiley, Andrew A Crowl, Paul S Manos, Emily B Sessa, Claudia Solís-Lemus, Anne D Yoder, J Gordon Burleigh
Accurately reconstructing the reticulate histories of polyploids remains a central challenge for understanding plant evolution. Although phylogenetic networks can provide insights into relationships among polyploid lineages, inferring networks may be hindered by the complexities of homology determination in polyploid taxa. We use simulations to show that phasing alleles from allopolyploid individuals can improve phylogenetic network inference under the multispecies coalescent by obtaining the true network with fewer loci compared with haplotype consensus sequences or sequences with heterozygous bases represented as ambiguity codes. Phased allelic data can also improve divergence time estimates for networks, which is helpful for evaluating allopolyploid speciation hypotheses and proposing mechanisms of speciation. To achieve these outcomes in empirical data, we present a novel pipeline that leverages a recently developed phasing algorithm to reliably phase alleles from polyploids. This pipeline is especially appropriate for target enrichment data, where the depth of coverage is typically high enough to phase entire loci. We provide an empirical example in the North American Dryopteris fern complex that demonstrates insights from phased data as well as the challenges of network inference. We establish that our pipeline (PATÉ: Phased Alleles from Target Enrichment data) is capable of recovering a high proportion of phased loci from both diploids and polyploids. These data may improve network estimates compared with using haplotype consensus assemblies by accurately inferring the direction of gene flow, but statistical nonidentifiability of phylogenetic networks poses a barrier to inferring the evolutionary history of reticulate complexes.
{"title":"Benefits and Limits of Phasing Alleles for Network Inference of Allopolyploid Complexes.","authors":"George P Tiley, Andrew A Crowl, Paul S Manos, Emily B Sessa, Claudia Solís-Lemus, Anne D Yoder, J Gordon Burleigh","doi":"10.1093/sysbio/syae024","DOIUrl":"10.1093/sysbio/syae024","url":null,"PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"666-682"},"PeriodicalIF":6.1,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140908806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The ideal approach to Bayesian phylogenetic inference is to estimate all parameters of interest jointly in a single hierarchical model. However, this is often not feasible in practice due to the high computational cost. Instead, phylogenetic pipelines generally consist of sequential analyses, whereby a single point estimate from a given analysis is used as input for the next analysis (e.g., a single multiple sequence alignment is used to estimate a gene tree). In this framework, uncertainty is not propagated from step to step, which can lead to inaccurate or spuriously confident results. Here, we formally develop and test a sequential inference approach for Bayesian phylogenetic inference, which uses importance sampling to generate observations for the next step of an analysis pipeline from the posterior distribution produced in the previous step. The sequential inference approach presented here not only accounts for uncertainty between analysis steps but also allows for greater flexibility in software choice (and hence model availability) and can be computationally more efficient than the traditional joint inference approach when multiple models are being tested. We show that our sequential inference approach is identical in practice to the joint inference approach only if sufficient information in the data is present (a narrow posterior distribution) and/or sufficiently many importance samples are used. Conversely, we show that the common practice of using a single point estimate can be biased, for example, when a single phylogeny estimate is used to transform an unrooted phylogeny into a time-calibrated phylogeny. We demonstrate the theory of sequential Bayesian inference using both a toy example and an empirical case study of divergence-time estimation in insects using a relaxed clock model from transcriptome data. 
In the empirical example, we estimate 3 posterior distributions of branch lengths from the same data (DNA character matrix with a GTR+Γ+I substitution model, an amino acid data matrix with empirical substitution models, and an amino acid data matrix with the PhyloBayes CAT-GTR model). Finally, we apply 3 different node-calibration strategies and show that divergence time estimates are affected by both the data source and underlying substitution process to estimate branch lengths as well as the node-calibration strategies. Thus, our new sequential Bayesian phylogenetic inference provides the opportunity to efficiently test different approaches for divergence time estimation, including branch-length estimation from other software.
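The contrast drawn in this abstract, between plugging in a single point estimate and carrying first-step uncertainty into the next step, can be illustrated with a small sketch. This is not the authors' implementation and it does not use their importance-sampling scheme: it simply averages a second-step result over equally weighted draws from a first-step posterior. The conjugate normal model, the function names, and all numeric values are assumptions chosen only so that the arithmetic is easy to check.

```python
import random
import statistics

def second_step_posterior(x, prior_sd=2.0, noise_sd=1.0):
    """Hypothetical second step: prior t ~ N(0, prior_sd^2), x | t ~ N(t, noise_sd^2).

    Returns the mean and variance of the conjugate normal posterior of t given x.
    """
    weight = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    var = prior_sd**2 * noise_sd**2 / (prior_sd**2 + noise_sd**2)
    return weight * x, var

def plug_in(first_step_draws):
    """Common practice: collapse the first-step posterior to one point estimate."""
    point = statistics.fmean(first_step_draws)
    return second_step_posterior(point)

def propagated(first_step_draws):
    """Average the second step over every first-step draw (equal weights).

    The result is a mixture of normals; its variance follows the law of total
    variance: within-component variance plus the variance of component means.
    """
    means = [second_step_posterior(x)[0] for x in first_step_draws]
    _, within_var = second_step_posterior(first_step_draws[0])
    return statistics.fmean(means), within_var + statistics.pvariance(means)

# Assumed stand-in for a first-step posterior (e.g., of a branch length).
rng = random.Random(1)
draws = [rng.gauss(3.0, 0.8) for _ in range(5000)]

plug_mean, plug_var = plug_in(draws)
seq_mean, seq_var = propagated(draws)
print(f"plug-in:    mean={plug_mean:.3f} var={plug_var:.3f}")
print(f"propagated: mean={seq_mean:.3f} var={seq_var:.3f}")
```

In this linear toy model the two approaches agree on the posterior mean, but the plug-in variance omits the spread of the first-step posterior, which is the kind of spurious confidence the abstract warns about.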
{"title":"Sequential Bayesian Phylogenetic Inference.","authors":"Sebastian Höhna, Allison Y Hsiang","doi":"10.1093/sysbio/syae020","DOIUrl":"10.1093/sysbio/syae020","url":null,"PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"704-721"},"PeriodicalIF":6.1,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141071866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kate Truman, Timothy G Vaughan, Alex Gavryushkin, Alexandra Sasha Gavryushkina
Time-dependent birth-death sampling models have been used in numerous studies for inferring past evolutionary dynamics in different biological contexts, e.g. speciation and extinction rates in macroevolutionary studies, or effective reproductive number in epidemiological studies. These models are branching processes where lineages can bifurcate, die, or be sampled with time-dependent birth, death, and sampling rates, generating phylogenetic trees. It has been shown that in some subclasses of such models, different sets of rates can result in the same distributions of reconstructed phylogenetic trees, and therefore the rates become unidentifiable from the trees regardless of their size. Here we show that widely used time-dependent fossilised birth-death (FBD) models are identifiable. This subclass of models makes more realistic assumptions about the fossilisation process and certain infectious disease transmission processes than the unidentifiable birth-death sampling models. Namely, FBD models assume that sampled lineages stay in the process rather than being immediately removed upon sampling. Identifiability of the time-dependent FBD model justifies using statistical methods that implement this model to infer the underlying temporal diversification or epidemiological dynamics from phylogenetic trees or directly from molecular or other comparative data. We further show that the time-dependent fossilised-birth-death model with an extra parameter, the removal after sampling probability, is unidentifiable. This implies that in scenarios where we do not know how sampling affects lineages we are unable to infer this extra parameter together with birth, death, and sampling rates solely from trees.
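The branching process described in this abstract can be sketched as a small event-driven simulation. This is an illustrative simplification, not the models analysed in the paper: rates are held constant rather than time-dependent, only lineage and sample counts are tracked rather than the tree, and the function name `simulate_bds` and its parameters are hypothetical. Setting `removal_prob=0.0` mimics the FBD assumption that sampled lineages stay in the process, while `removal_prob=1.0` removes every lineage upon sampling.

```python
import random

def simulate_bds(birth, death, sampling, removal_prob, t_max, seed=0):
    """Simulate a constant-rate birth-death-sampling process from one lineage.

    Returns (lineages alive at the end, number of samples taken).
    """
    rng = random.Random(seed)
    per_lineage_rate = birth + death + sampling
    t, n, samples = 0.0, 1, 0
    while n > 0 and per_lineage_rate > 0:
        # Waiting time to the next event across all n lineages.
        t += rng.expovariate(n * per_lineage_rate)
        if t > t_max:
            break
        u = rng.random() * per_lineage_rate
        if u < birth:
            n += 1                      # a lineage bifurcates
        elif u < birth + death:
            n -= 1                      # a lineage dies unsampled
        else:
            samples += 1                # a lineage is sampled
            if rng.random() < removal_prob:
                n -= 1                  # and is removed upon sampling
    return n, samples

# Example run with assumed rates; sampled lineages persist (FBD-like).
print(simulate_bds(birth=1.0, death=0.5, sampling=0.3, removal_prob=0.0, t_max=3.0))
```

The only difference between the two regimes in this sketch is the single `removal_prob` branch, which corresponds to the extra removal-after-sampling parameter that the abstract reports cannot be inferred jointly with the birth, death, and sampling rates from trees alone.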
{"title":"The Fossilised Birth-Death Model is Identifiable.","authors":"Kate Truman, Timothy G Vaughan, Alex Gavryushkin, Alexandra Sasha Gavryushkina","doi":"10.1093/sysbio/syae058","DOIUrl":"10.1093/sysbio/syae058","url":null,"PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142475252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benjamin S Toups, Robert C Thomson, Jeremy M Brown
Variation in gene tree estimates is widely observed in empirical phylogenomic data and is often assumed to be the result of biological processes. However, a recent study using tetrapod mitochondrial genomes to control for biological sources of variation due to their haploid, uniparentally inherited, and non-recombining nature found that levels of discordance among mitochondrial gene trees were comparable to those found in studies that assume only biological sources of variation. Additionally, they found that several of the models of sequence evolution chosen to infer gene trees were doing an inadequate job fitting the sequence data. These results indicated that significant amounts of gene tree discordance in empirical data may be due to poor fit of sequence evolution models, and that more complex and biologically realistic models may be needed. To test how the fit of sequence evolution models relates to gene tree discordance, we analyzed the same mitochondrial datasets as the previous study using two additional, more complex models of sequence evolution that each includes a different biologically realistic aspect of the evolutionary process: a covarion model to incorporate site-specific rate variation across lineages (heterotachy), and a partitioned model to incorporate variable evolutionary patterns by codon position. Our results show that both additional models fit the data better than the models used in the previous study, with the covarion being consistently and strongly preferred as tree size increases. However, even these more preferred models still inferred highly discordant mitochondrial gene trees, thus deepening the mystery around what we label the "Mito-Phylo Paradox" and leading us to ask whether the observed variation could, in fact, be biological in nature after all.
{"title":"Complex Models of Sequence Evolution Improve Fit, but not Gene Tree Discordance, for Tetrapod Mitogenomes.","authors":"Benjamin S Toups, Robert C Thomson, Jeremy M Brown","doi":"10.1093/sysbio/syae056","DOIUrl":"https://doi.org/10.1093/sysbio/syae056","url":null,"PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142406814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}