Thèo Gaboriau, Joseph A Tobias, Daniele Silvestro, Nicolas Salamin
Popular comparative phylogenetic models such as Brownian Motion, Ornstein-Uhlenbeck, and their extensions assume that, at speciation, a trait value is inherited identically by the two descendant species. This assumption contrasts with models of speciation at a micro-evolutionary scale, where descendants' phenotypic distributions are sub-samples of the ancestral distribution. Different speciation mechanisms can lead to a displacement of the ancestral phenotypic mean among descendants and an asymmetric inheritance of the ancestral phenotypic variance. In contrast, even macro-evolutionary models that account for intraspecific variance assume symmetric and conserved inheritance of the ancestral phenotypic distribution at speciation. Here we develop an Asymmetric Brownian Motion model (ABM) that relaxes the assumption of symmetric and conserved inheritance of the ancestral distribution at the time of speciation. The ABM jointly models the evolution of both intra- and inter-specific phenotypic variation. It also infers the mode of phenotypic inheritance at speciation, which can range from a symmetric and conserved inheritance, where descendants inherit the ancestral distribution, to an asymmetric and displaced inheritance, where descendants inherit divergent phenotypic means and variances. To demonstrate this model, we analyze the evolution of beak morphology in Darwin's finches, finding evidence of displacement at speciation. The ABM helps to bridge micro- and macro-evolutionary models of trait evolution by providing a more robust framework for testing the effects of ecological speciation, character displacement, and niche partitioning on trait evolution at the macro-evolutionary scale.
Exploring the Macroevolutionary Signature of Asymmetric Inheritance at Speciation. Systematic Biology, published 2024-07-24. https://doi.org/10.1093/sysbio/syae043
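The range of inheritance modes described in this abstract, from symmetric and conserved to asymmetric and displaced, can be illustrated with a small sketch. This is not the published ABM: the `shift` and `var_fractions` parameters, the function names, and all numeric values are assumptions chosen only to show how a speciation event might hand different means and variances to two descendants before each lineage continues to evolve by Brownian motion.

```python
import math
import random


def split_at_speciation(mean, variance, shift=0.0, var_fractions=(1.0, 1.0)):
    """Return (mean, variance) pairs for the two descendants of one ancestor.

    `shift` and `var_fractions` are hypothetical illustration parameters,
    not the parameters of the published ABM.
    """
    left = (mean - shift, variance * var_fractions[0])
    right = (mean + shift, variance * var_fractions[1])
    return left, right


def brownian_step(mean, rate, dt, rng):
    """Advance a lineage mean by one Brownian increment with variance rate * dt."""
    return mean + rng.gauss(0.0, math.sqrt(rate * dt))


# Symmetric, conserved inheritance: both descendants copy the ancestor.
conserved = split_at_speciation(10.0, 4.0)

# Asymmetric, displaced inheritance: means diverge and variances differ.
displaced = split_at_speciation(10.0, 4.0, shift=0.5, var_fractions=(0.25, 1.0))

# After the split, each lineage mean keeps evolving by Brownian motion.
rng = random.Random(1)
left_mean = brownian_step(displaced[0][0], rate=0.1, dt=1.0, rng=rng)
```

With the default arguments the two descendants are identical to the ancestor, which corresponds to the assumption made by standard Brownian Motion models; any non-zero `shift` or unequal `var_fractions` departs from it.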
Benjamin M Titus, H Lisle Gibbs, Nuno Simões, Marymegan Daly
Recent genomic analyses have highlighted the prevalence of speciation with gene flow in many taxa and have underscored the importance of accounting for these reticulate evolutionary processes when constructing species trees and generating parameter estimates. This is especially important for deepening our understanding of speciation in the sea, where fast-moving ocean currents, expanses of deep water, and periodic episodes of sea-level rise and fall act as soft and temporary allopatric barriers that facilitate both divergence and secondary contact. Under these conditions, gene flow is not expected to cease completely, while contemporary distributions are expected to differ from historical ones. Here we conduct range-wide sampling for Pederson's cleaner shrimp (Ancylomenes pedersoni), a species complex from the Greater Caribbean that contains three clearly delimited mitochondrial lineages with both allopatric and sympatric distributions. Using mtDNA barcodes and a genomic ddRADseq approach, we combine classic phylogenetic analyses with extensive topology testing and demographic modeling (10 site frequency replicates x 45 evolutionary models x 50 model simulations/replicate = 22,500 simulations) to test species boundaries and reconstruct the evolutionary history of what was expected to be a simple case study. Instead, our results indicate a history of allopatric divergence, secondary contact, introgression, and endemic hybrid speciation that we hypothesize was driven by the final closure of the Isthmus of Panama and the strengthening of the Gulf Stream Current ~3.5 million years ago. The history of this species complex recovered by model-based methods that allow reticulation differs from that recovered by standard phylogenetic analyses and is unexpected given contemporary distributions. 
The geologically and biologically meaningful insights gained by our model selection analyses illuminate what is likely a novel pathway of species formation not previously documented that resulted from one of the most biogeographically significant events in Earth's history.
Topology Testing and Demographic Modeling Illuminate a Novel Speciation Pathway in the Greater Caribbean Sea Following the Formation of the Isthmus of Panama. Systematic Biology, published 2024-07-23. https://doi.org/10.1093/sysbio/syae045
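The simulation count quoted in this abstract (10 site frequency replicates x 45 evolutionary models x 50 simulations per replicate) can be checked by enumerating the full design. The sketch below only reproduces the arithmetic; the variable names are illustrative and nothing here reflects the authors' actual pipeline.

```python
from itertools import product

n_replicates = 10   # site frequency replicates
n_models = 45       # evolutionary (demographic) models
n_sims = 50         # model simulations per replicate

# One entry per (replicate, model, simulation) combination.
runs = list(product(range(n_replicates), range(n_models), range(n_sims)))
total = len(runs)
```

Here `total` equals 22,500, matching the figure stated in the abstract.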
Chen Ren, Long Wang, Ze-Long Nie, Ming Tang, Gabriel Johnson, Hui-Tong Tan, Nian-He Xia, Jun Wen, Qin-Er Yang
Polyploidy is a significant mechanism in eukaryotic evolution and is particularly prevalent in the plant kingdom. However, our knowledge about this phenomenon and its effects on evolution remains limited. A major obstacle to the study of polyploidy is the great difficulty in untangling the origins of allopolyploids. Due to the drastic genome changes and the erosion of allopolyploidy signals caused by the combined effects of hybridization and complex post-polyploid diploidization processes, resolving the origins of allopolyploids has long been a challenging task. Here we revisit this issue with the interesting case of subtribe Tussilagininae (Asteraceae: Senecioneae) and by developing HomeoSorter, a new pipeline for network inference that phases homeologs to parental subgenomes. The pipeline builds on the basic idea of a previous study but with major changes to address the scaling problem and implement some new functions. With simulated data, we demonstrate that HomeoSorter works efficiently on genome-scale data and has high accuracy in identifying polyploid patterns and assigning homeologs. Using HomeoSorter, the maximum pseudo-likelihood model of PhyloNet, and genome-scale data, we further address the complex origin of Tussilagininae, a speciose group (ca. 45 genera and 710 species) characterized by high base chromosome numbers (mainly x = 30, 40). In particular, the inferred patterns are strongly supported by the chromosomal evidence. Tussilagininae is revealed to comprise two large groups with successive allopolyploid origins: Tussilagininae s.s. (mainly x = 30) and the Gynoxyoid group (x = 40). Two allopolyploidy events first give rise to Tussilagininae s.s., with the first event occurring between the ancestor of subtribe Senecioninae (x = 10) and a lineage (most probably with x = 10) related to the Brachyglottis alliance, and the resulting hybrid lineage crossing with the ancestor of Chersodoma (x = 10) and leading to Tussilagininae s.s. 
Then, after early diversification, the Central American group (mainly x = 30) of Tussilagininae s.s. is involved in a third allopolyploidy event with, again, the Chersodoma lineage and produces the Gynoxyoid group. Our study highlights the value of HomeoSorter and the homeolog-sorting approach in polyploid phylogenetics. With rich species diversity and clear evolutionary patterns, Tussilagininae s.s. and the Gynoxyoid group are also excellent models for future investigations of polyploidy.
Complex but Clear Allopolyploid Pattern of Subtribe Tussilagininae (Asteraceae: Senecioneae) Revealed by Robust Phylogenomic Evidence, with Development of a Novel Homeolog-Sorting Pipeline. Systematic Biology, published 2024-07-22. https://doi.org/10.1093/sysbio/syae046
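The chromosomal support mentioned in this abstract can be followed with simple arithmetic, under the assumption that an allopolyploid's base chromosome number is the plain sum of its two parental base numbers. That additive rule and the intermediate value of x = 20 for the first hybrid lineage are inferences made here for illustration; the abstract itself states only the parental numbers (x = 10) and the resulting groups (x = 30 and x = 40).

```python
def allopolyploid_base_number(parent_a, parent_b):
    """Base number of an allopolyploid, assuming simple addition of parental sets."""
    return parent_a + parent_b


senecioninae_ancestor = 10
brachyglottis_relative = 10   # reported only as very likely x = 10
chersodoma_ancestor = 10

# First event: Senecioninae ancestor x Brachyglottis-related lineage.
first_hybrid = allopolyploid_base_number(senecioninae_ancestor, brachyglottis_relative)

# Second event: the hybrid lineage x Chersodoma ancestor gives Tussilagininae s.s.
tussilagininae_ss = allopolyploid_base_number(first_hybrid, chersodoma_ancestor)

# Third event: Central American group (x = 30) x Chersodoma lineage gives the Gynoxyoid group.
gynoxyoid = allopolyploid_base_number(tussilagininae_ss, chersodoma_ancestor)
```

Under this assumption the three successive events yield x = 30 for Tussilagininae s.s. and x = 40 for the Gynoxyoid group, in line with the base numbers given in the abstract.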
Despite their extensive diversity and ecological importance, the history of diversification for most groups of parasitic organisms remains relatively understudied. Elucidating broad macroevolutionary patterns of parasites is challenging, often limited by the availability of samples, genetic resources, and knowledge about ecological relationships with their hosts. In this study, we explore the macroevolutionary history of parasites by focusing on parasitic body lice from doves. Building on extensive knowledge of ecological relationships and previous phylogenomic studies of their avian hosts, we tested specific questions about the evolutionary origins of the body lice of doves, leveraging whole genome data sets for phylogenomics. Specifically, we sequenced whole genomes from 68 samples of dove body lice, including representatives of all body louse genera from 51 host taxa. From these data, we assembled >2,300 nuclear genes to estimate dated phylogenetic relationships among body lice and several outgroup taxa. The resulting phylogeny of body lice was well supported, although some branches had conflicting signal across the genome. We then reconstructed ancestral biogeographic ranges of body lice and compared the body louse phylogeny to the phylogeny of doves, and also to a previously published phylogeny of the wing lice of doves. Divergence estimates placed the origin of body lice in the late Oligocene. Body lice likely originated in Australasia and dispersed with their hosts during the early Miocene, with subsequent codivergence and host switching throughout the world. Notably, this evolutionary history is very similar to that of dove wing lice, despite the stronger dispersal capabilities of wing lice compared to body lice. Our results highlight the central role of the biogeographic history of host organisms in driving the evolutionary history of their parasites across time and geographic space.
Biogeographic history of pigeons and doves drives the origin and diversification of their parasitic body lice. Andrew D Sweet, Jorge Doña, Kevin P Johnson. Systematic Biology, published 2024-07-22. https://doi.org/10.1093/sysbio/syae038
Anthony J Barley, Adrián Nieto-Montes de Oca, Norma L Manríquez-Morán, Robert C Thomson
Gene flow between diverging lineages challenges the resolution of species boundaries and the understanding of evolutionary history in recent radiations. Here, we integrate phylogenetic and coalescent tools to resolve reticulate patterns of diversification and use a perspective focused on evolutionary mechanisms to distinguish interspecific and intraspecific taxonomic variation. We use this approach to resolve the systematics for one of the most intensively studied but difficult-to-understand groups of reptiles: the spotted whiptail lizards of the genus Aspidoscelis (A. gularis complex). Whiptails contain the largest number of unisexual species known within any vertebrate group, and the spotted whiptail complex has played a key role in the generation of this diversity through hybrid speciation. Understanding lineage boundaries and the evolutionary history of divergence and reticulation within this group is therefore key to understanding the generation of unisexual diversity in whiptails. Despite this importance, long-standing confusion about their systematics has impeded understanding of which gonochoristic species have contributed to the formation of unisexual lineages. Using reduced representation genomic data, we resolve patterns of divergence and gene flow within the spotted whiptails and clarify patterns of hybrid speciation. We find evidence that biogeographically structured ecological and environmental variation has been important in morphological and genetic diversification, as well as the maintenance of species boundaries in this system. Our study elucidates how gene flow among lineages and the continuous nature of speciation can bias the practice of species delimitation and lead taxonomists operating under different frameworks to different conclusions (here we propose that a two-species arrangement best reflects our current understanding). 
In doing so, this study provides conceptual and methodological insights into approaches to resolving diversification patterns and species boundaries in rapid radiations with complex histories, as well as long-standing taxonomic challenges in the field of systematic biology.
Understanding Species Boundaries that Arise from Complex Histories: Gene Flow Across the Speciation Continuum in the Spotted Whiptail Lizards. Systematic Biology, published 2024-07-18. https://doi.org/10.1093/sysbio/syae040
Chloroplast capture, a phenomenon that can occur through interspecific hybridization and introgression, is frequently invoked to explain cytonuclear discordance in plants. However, relatively few studies have documented the mechanisms of cytonuclear coevolution and its potential for driving species differentiation and possible functional differences in the context of chloroplast capture. To address this question, we chose the genus Aquilegia, which is known for having minimal sterility among species, and inferred that A. amurensis captured the plastome of A. parviflora based on cytonuclear discordance and gene flow between the two species. We focused on the introgressed region and its differentiation from corresponding regions in closely related species, especially its composition in a chloroplast capture scenario. We found that nuclear genes encoding cytonuclear enzyme complexes (CECs; i.e., organelle-targeted genes) of the chloroplast donor species were selectively retained and displaced the original CEC genes in the chloroplast-receiving species due to cytonuclear interactions during introgression. Notably, CEC introgression was associated with a greater evolutionary distance between A. amurensis and A. parviflora for these CECs. Terpene synthase activity genes (GO: 0010333) were overrepresented among the introgressed genes, and more than 30% of these genes were CEC genes. These findings are consistent with our observation that the floral terpene release patterns of A. amurensis and A. parviflora are more similar to each other than to that of A. japonica. Our study clarifies the mechanisms of cytonuclear coevolution, species differentiation and functional differences in the context of chloroplast capture and highlights the potential role of chloroplast capture in adaptation.
Biased gene introgression and adaptation in the face of chloroplast capture in Aquilegia amurensis. Huaying Wang, Wei Zhang, Yanan Yu, Xiaoxue Fang, Tengjiao Zhang, Luyuan Xu, Lei Gong, Hongxing Xiao. Systematic Biology, published 2024-07-13. https://doi.org/10.1093/sysbio/syae039
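The statement that terpene synthase activity genes were overrepresented among introgressed genes refers to a GO enrichment result. The abstract does not say which test was used; one common choice is the upper tail of a hypergeometric distribution, sketched below. The function name and the gene counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, observed):
    """P(X >= observed) when `drawn` genes are sampled without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    hits = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(observed, upper + 1)
    )
    return hits / total


# Toy counts (hypothetical): 10 genes overall, 3 with the GO term,
# 4 introgressed genes, 2 of which carry the term.
p_value = hypergeom_upper_tail(population=10, annotated=3, drawn=4, observed=2)
```

A small p-value would indicate that the term appears among the introgressed genes more often than expected by chance; in a real analysis the counts would come from the annotated genome and the tests would be corrected for multiple comparisons.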
Danielle K Herrig, Ryan D Ridenbaugh, Kim L Vertacnik, Kathryn M Everson, Sheina B Sim, Scott M Geib, David W Weisrock, Catherine R Linnen
Rapidly evolving taxa are excellent models for understanding the mechanisms that give rise to biodiversity. However, developing an accurate historical framework for comparative analysis of such lineages remains a challenge due to ubiquitous incomplete lineage sorting and introgression. Here, we use a whole-genome alignment, multiple locus-sampling strategies, and summary-tree and SNP-based species-tree methods to infer a species tree for eastern North American Neodiprion species, a clade of pine-feeding sawflies (Order: Hymenoptera; Family: Diprionidae). We recovered a well-supported species tree that, except for three uncertain relationships, was robust to different strategies for analyzing whole-genome data. Nevertheless, underlying gene-tree discordance was high. To understand this genealogical variation, we used multiple linear regression to model site concordance factors estimated in 50-kb windows as a function of several genomic predictor variables. We found that site concordance factors tended to be higher in regions of the genome with more parsimony-informative sites, fewer singletons, less missing data, lower GC content, more genes, lower recombination rates, and lower D-statistics (less introgression). Together, these results suggest that incomplete lineage sorting, introgression, and genotyping error all shape the genomic landscape of gene-tree discordance in Neodiprion. More generally, our findings demonstrate how combining phylogenomic analysis with knowledge of local genomic features can reveal mechanisms that produce topological heterogeneity across genomes.
{"title":"Whole Genomes Reveal Evolutionary Relationships and Mechanisms Underlying Gene-Tree Discordance in Neodiprion Sawflies.","authors":"Danielle K Herrig, Ryan D Ridenbaugh, Kim L Vertacnik, Kathryn M Everson, Sheina B Sim, Scott M Geib, David W Weisrock, Catherine R Linnen","doi":"10.1093/sysbio/syae036","DOIUrl":"https://doi.org/10.1093/sysbio/syae036","url":null,"abstract":"<p><p>Rapidly evolving taxa are excellent models for understanding the mechanisms that give rise to biodiversity. However, developing an accurate historical framework for comparative analysis of such lineages remains a challenge due to ubiquitous incomplete lineage sorting and introgression. Here, we use a whole-genome alignment, multiple locus-sampling strategies, and summary-tree and SNP-based species-tree methods to infer a species tree for eastern North American Neodiprion species, a clade of pine-feeding sawflies (Order: Hymenopteran; Family: Diprionidae). We recovered a well-supported species tree that-except for three uncertain relationships-was robust to different strategies for analyzing whole-genome data. Nevertheless, underlying gene-tree discordance was high. To understand this genealogical variation, we used multiple linear regression to model site concordance factors estimated in 50-kb windows as a function of several genomic predictor variables. We found that site concordance factors tended to be higher in regions of the genome with more parsimony-informative sites, fewer singletons, less missing data, lower GC content, more genes, lower recombination rates, and lower D-statistics (less introgression). Together, these results suggest that incomplete lineage sorting, introgression, and genotyping error all shape the genomic landscape of gene-tree discordance in Neodiprion. 
More generally, our findings demonstrate how combining phylogenomic analysis with knowledge of local genomic features can reveal mechanisms that produce topological heterogeneity across genomes.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":null,"pages":null},"PeriodicalIF":6.1,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141545293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Philipp Mitteroecker, Michael L Collyer, Dean C Adams
Due to the hierarchical structure of the tree of life, closely related species often resemble each other more than distantly related species, a pattern termed phylogenetic signal. Numerous univariate statistics have been proposed as measures of phylogenetic signal for single phenotypic traits, but the study of phylogenetic signal for multivariate data, as is common in modern biology, remains challenging. Here we introduce a new method to explore phylogenetic signal in multivariate phenotypes. Our approach decomposes the data into linear combinations with maximal (or minimal) phylogenetic signal, as measured by Blomberg's K. The loading vectors of these phylogenetic components or K-components can be biologically interpreted, and scatterplots of the scores can be used as a low-dimensional ordination of the data that maximally (or minimally) preserves phylogenetic signal. We present algebraic and statistical properties, along with two new summary statistics, KA and KG, of phylogenetic signal in multivariate data. Simulation studies showed that KA and KG have higher statistical power than the previously suggested statistic Kmult, especially if phylogenetic signal is low or concentrated in a few trait dimensions. In two empirical applications to vertebrate cranial shape (crocodyliforms and papionins), we found statistically significant phylogenetic signal concentrated in a few trait dimensions. The finding that phylogenetic signal can be highly variable across the dimensions of multivariate phenotypes has important implications for current maximum likelihood approaches to phylogenetic signal in multivariate data.
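The univariate statistic that this method builds on, Blomberg's K, can be sketched for a single trait or a single linear combination of traits. The sketch uses the commonly stated form of K: the observed ratio of the mean squared error around the phylogenetic mean to the mean squared error under the tree's covariance, divided by the ratio expected under Brownian motion. It does not implement the K-component decomposition or the KA and KG statistics from the abstract. The function names `solve` and `blomberg_k`, the covariance matrices, and the trait values are hypothetical.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]
        for k in range(n):
            if k != col:
                f = M[k][col]
                M[k] = [vk - f * vc for vk, vc in zip(M[k], M[col])]
    return [M[i][n] for i in range(n)]


def blomberg_k(x, C):
    """Blomberg's K for trait values x and phylogenetic covariance matrix C."""
    n = len(x)
    u = solve(C, [1.0] * n)                            # C^{-1} 1
    s = sum(u)                                         # 1' C^{-1} 1
    a_hat = sum(ui * xi for ui, xi in zip(u, x)) / s   # phylogenetic mean
    r = [xi - a_hat for xi in x]                       # centered trait values
    w = solve(C, r)                                    # C^{-1} (x - a_hat)
    observed = sum(ri * ri for ri in r) / sum(ri * wi for ri, wi in zip(r, w))
    expected = (sum(C[i][i] for i in range(n)) - n / s) / (n - 1)
    return observed / expected


# Hypothetical balanced four-taxon tree ((A,B),(C,D)): depth 1, sister pairs share 0.5.
C_balanced = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
# Hypothetical star tree: no shared history, every tip at depth 2.
C_star = [[2.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

trait = [1.0, 1.2, 3.0, 3.3]  # made-up values in which sister taxa resemble each other
k_balanced = blomberg_k(trait, C_balanced)
k_star = blomberg_k([1.0, 3.0, 2.0, 7.0], C_star)
print(k_balanced, k_star)
```

Two properties make useful sanity checks. K does not change when the trait is shifted or rescaled, and on a star tree the observed and expected ratios coincide, so K equals 1 whatever the data.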
{"title":"Exploring Phylogenetic Signal in Multivariate Phenotypes by Maximizing Blomberg's K.","authors":"Philipp Mitteroecker, Michael L Collyer, Dean C Adams","doi":"10.1093/sysbio/syae035","DOIUrl":"https://doi.org/10.1093/sysbio/syae035","url":null,"abstract":"<p><p>Due to the hierarchical structure of the tree of life, closely related species often resemble each other more than distantly related species; a pattern termed phylogenetic signal. Numerous univariate statistics have been proposed as measures of phylogenetic signal for single phenotypic traits, but the study of phylogenetic signal for multivariate data, as is common in modern biology, remains challenging. Here we introduce a new method to explore phylogenetic signal in multivariate phenotypes. Our approach decomposes the data into linear combinations with maximal (or minimal) phylogenetic signal, as measured by Blomberg's K. The loading vectors of these phylogenetic components or K-components can be biologically interpreted, and scatterplots of the scores can be used as a low-dimensional ordination of the data that maximally (or minimally) preserves phylogenetic signal. We present algebraic and statistical properties, along with two new summary statistics, KA and KG, of phylogenetic signal in multivariate data. Simulation studies showed that KA and KG have higher statistical power than the previously suggested statistic Kmult, especially if phylogenetic signal is low or concentrated in a few trait dimensions. In two empirical applications to vertebrate cranial shape (crocodyliforms and papionins), we found statistically significant phylogenetic signal concentrated in a few trait dimensions. 
The finding that phylogenetic signal can be highly variable across the dimensions of multivariate phenotypes has important implications for current maximum likelihood approaches to phylogenetic signal in multivariate data.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":null,"pages":null},"PeriodicalIF":6.1,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141545292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uyen Mai, Eduardo Charvel, Siavash Mirarab
Dating phylogenetic trees to obtain branch lengths in units of time is essential for many downstream applications but has remained challenging. Dating requires inferring substitution rates that can change across the tree. While we can assume that information about a small subset of nodes is available from the fossil record or sampling times (for fast-evolving organisms), inferring the ages of the other nodes essentially requires extrapolation and interpolation. Assuming a distribution of branch rates, we can formulate dating as a constrained maximum likelihood (ML) estimation problem. While ML dating methods exist, their accuracy degrades in the face of model misspecification, where the assumed parametric statistical distribution of branch rates differs vastly from the true distribution. Notably, most existing methods assume rigid, often unimodal, branch rate distributions. A second challenge is that the likelihood function involves an integral over the continuous domain of the rates and often leads to difficult non-convex optimization problems. To tackle these two challenges, we propose a new method called Molecular Dating using Categorical-models (MD-Cat). MD-Cat uses a categorical model of rates inspired by non-parametric statistics and can approximate a large family of models by discretizing the rate distribution into k categories. Under this model, we can use the Expectation-Maximization (EM) algorithm to co-estimate rate categories and branch lengths in time units. Our model makes fewer assumptions about the true distribution of branch rates than parametric models such as the Gamma or LogNormal distributions. Our results on two simulated and real datasets of Angiosperms and HIV and a wide selection of rate distributions show that MD-Cat is often more accurate than the alternatives, especially on datasets with exponential or multimodal rate distributions.
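The idea of a categorical rate model fitted by EM can be illustrated with a deliberately simplified sketch. It is not MD-Cat itself. It assumes that branch durations are already known, that substitution counts per branch are Poisson distributed with mean rate × exposure (exposure standing for duration × number of sites), and that the k categories are equally weighted. MD-Cat, by contrast, co-estimates the times. The function name `em_rate_categories` and all numbers are hypothetical.

```python
import math


def em_rate_categories(counts, exposures, init_rates, n_iter=50):
    """EM for k equally weighted rate categories under a Poisson count model.

    counts[i]    : substitutions observed on branch i
    exposures[i] : branch duration times number of sites (assumed known here)
    init_rates   : starting value for each category's rate (all > 0)
    Returns (rates, responsibilities).
    """
    rates = list(init_rates)
    k = len(rates)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each category for each branch.
        resp = []
        for c, e in zip(counts, exposures):
            logp = [c * math.log(r * e) - r * e for r in rates]  # log Poisson, constants dropped
            m = max(logp)
            w = [math.exp(lp - m) for lp in logp]                # stabilised softmax
            total = sum(w)
            resp.append([wi / total for wi in w])
        # M-step: weighted Poisson maximum likelihood estimate per category.
        for j in range(k):
            num = sum(w[j] * c for w, c in zip(resp, counts))
            den = sum(w[j] * e for w, e in zip(resp, exposures))
            if num > 0 and den > 0:
                rates[j] = num / den
    return rates, resp


# Hypothetical data: two branches evolving at rate 1 and two at rate 5.
counts = [10, 20, 50, 100]
exposures = [10.0, 20.0, 10.0, 20.0]
rates, resp = em_rate_categories(counts, exposures, init_rates=[0.5, 8.0])
print([round(r, 6) for r in rates])
```

In this toy case the two groups of branches are well separated, so the responsibilities are nearly 0 or 1 and the category rates settle at the count-to-exposure ratio of each group. A discretized model of this kind needs no parametric form for the rate distribution, which is the property the abstract emphasizes.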
{"title":"Expectation-Maximization enables Phylogenetic Dating under a Categorical Rate Model.","authors":"Uyen Mai, Eduardo Charvel, Siavash Mirarab","doi":"10.1093/sysbio/syae034","DOIUrl":"10.1093/sysbio/syae034","url":null,"abstract":"<p><p>Dating phylogenetic trees to obtain branch lengths in time unit is essential for many downstream applications but has remained challenging. Dating requires inferring substitution rates that can change across the tree. While we can assume to have information about a small subset of nodes from the fossil record or sampling times (for fast-evolving organisms), inferring the ages of the other nodes essentially requires extrapolation and interpolation. Assuming a distribution of branch rates, we can formulate dating as a constrained maximum likelihood (ML) estimation problem. While ML dating methods exist, their accuracy degrades in the face of model misspecification where the assumed parametric statistical distribution of branch rates vastly differs from the true distribution. Notably, most existing methods assume rigid, often unimodal, branch rate distributions. A second challenge is that the likelihood function involves an integral over the continuous domain of the rates and often leads to difficult non-convex optimization problems. To tackle these two challenges, we propose a new method called Molecular Dating using Categorical-models (MD-Cat). MD-Cat uses a categorical model of rates inspired by non-parametric statistics and can approximate a large family of models by discretizing the rate distribution into k categories. Under this model, we can use the Expectation- Maximization (EM) algorithm to co-estimate rate categories and branch lengths in time units. Our model has fewer assumptions about the true distribution of branch rates than parametric models such as Gamma or LogNormal distribution. 
Our results on two simulated and real datasets of Angiosperms and HIV and a wide selection of rate distributions show that MD-Cat is often more accurate than the alternatives, especially on datasets with exponential or multimodal rate distributions.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":null,"pages":null},"PeriodicalIF":6.1,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141545291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Basanta Khakurel, Courtney Grigsby, Tyler D Tran, Juned Zariwala, Sebastian Höhna, April M Wright
Phylogenetic trees establish a historical context for the study of organismal form and function. Most phylogenetic trees are estimated using a model of evolution. For molecular data, modeling evolution is often based on biochemical observations about changes between character states. For example, there are four nucleotides, and we can make assumptions about the probability of transitions between them. By contrast, for morphological characters, we may not know a priori how many character states there are per character, as both extant sampling and the fossil record may be highly incomplete, which leads to an observer bias. For a given character, the state space may be larger than what has been observed in the sample of taxa collected by the researcher. In this case, how many evolutionary rates are needed to even describe transitions between morphological character states may not be clear, potentially leading to model misspecification. To explore the impact of this model misspecification, we simulated character data with varying numbers of character states per character. We then used the data to estimate phylogenetic trees using models of evolution with the correct number of character states and an incorrect number of character states. The results of this study indicate that this observer bias may lead to phylogenetic error, particularly in the branch lengths of trees. If the state space is wrongly assumed to be too large, then we underestimate the branch lengths, and the opposite occurs when the state space is wrongly assumed to be too small.
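The direction of the branch-length bias can be reproduced in a much simpler setting than the Bayesian tree inference used in the study: a pairwise distance under the equal-rates Mk model. With k states and a branch length t measured in expected changes per character, two tips differ with probability (k-1)/k · (1 − exp(−k·t/(k−1))), and inverting this gives the distance implied by an observed proportion of differing characters. The sketch below is an illustration under that assumption only. The function names and the chosen values of t and k are hypothetical.

```python
import math


def mk_p_diff(t, k):
    """Probability that two tips differ at a character under Mk with k states,
    separated by total branch length t (expected changes per character)."""
    return (k - 1) / k * (1.0 - math.exp(-k * t / (k - 1)))


def mk_distance(p, k):
    """Branch length implied by a proportion p of differing characters,
    assuming k states (inverse of mk_p_diff)."""
    limit = (k - 1) / k  # saturation level for k states
    if not 0.0 <= p < limit:
        raise ValueError("p must lie in [0, (k-1)/k)")
    return -limit * math.log(1.0 - p / limit)


true_t, true_k = 0.5, 4
p = mk_p_diff(true_t, true_k)  # expected proportion of differences

# Re-estimate the branch length while assuming too few, the correct,
# and too many character states.
estimates = {k: mk_distance(p, k) for k in (2, 4, 8)}
for k, t_hat in estimates.items():
    print(k, round(t_hat, 4))
```

With the correct k the true length is recovered. Assuming too small a state space inflates the estimate and assuming too large a state space deflates it, which matches the direction of error reported in the abstract.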
{"title":"The fundamental role of character coding in Bayesian morphological phylogenetics.","authors":"Basanta Khakurel, Courtney Grigsby, Tyler D Tran, Juned Zariwala, Sebastian Höhna, April M Wright","doi":"10.1093/sysbio/syae033","DOIUrl":"https://doi.org/10.1093/sysbio/syae033","url":null,"abstract":"<p><p>Phylogenetic trees establish a historical context for the study of organismal form and function. Most phylogenetic trees are estimated using a model of evolution. For molecular data, modeling evolution is often based on biochemical observations about changes between character states. For example, there are four nucleotides, and we can make assumptions about the probability of transitions between them. By contrast, for morphological characters, we may not know a priori how many characters states there are per character, as both extant sampling and the fossil record may be highly incomplete, which leads to an observer bias. For a given character, the state space may be larger than what has been observed in the sample of taxa collected by the researcher. In this case, how many evolutionary rates are needed to even describe transitions between morphological character states may not be clear, potentially leading to model misspecification. To explore the impact of this model misspecification, we simulated character data with varying numbers of character states per character. We then used the data to estimate phylogenetic trees using models of evolution with the correct number of character states and an incorrect number of character states. The results of this study indicate that this observer bias may lead to phylogenetic error, particularly in the branch lengths of trees. 
If the state space is wrongly assumed to be too large, then we underestimate the branch lengths, and the opposite occurs when the state space is wrongly assumed to be too small.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":null,"pages":null},"PeriodicalIF":6.1,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141535331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}