
Systematic Biology: Latest Articles

Sequential Bayesian Phylogenetic Inference
IF 6.5, Zone 1 (Biology), Q1 Agricultural and Biological Sciences, Pub Date: 2024-05-21, DOI: 10.1093/sysbio/syae020
Sebastian Höhna, Allison Y Hsiang

The ideal approach to Bayesian phylogenetic inference is to estimate all parameters of interest jointly in a single hierarchical model. However, this is often not feasible in practice due to the high computational cost. Instead, phylogenetic pipelines generally consist of sequential analyses, whereby a single point estimate from a given analysis is used as input for the next analysis (e.g., a single multiple sequence alignment is used to estimate a gene tree). In this framework, uncertainty is not propagated from step to step, which can lead to inaccurate or spuriously confident results. Here, we formally develop and test a sequential inference approach for Bayesian phylogenetic inference, which uses importance sampling to generate observations for the next step of an analysis pipeline from the posterior distribution produced in the previous step. Our sequential inference approach presented here not only accounts for uncertainty between analysis steps, but also allows for greater flexibility in software choice (and hence model availability) and can be computationally more efficient than the traditional joint inference approach when multiple models are being tested. We show that our sequential inference approach is identical in practice to the joint inference approach only if sufficient information in the data is present (a narrow posterior distribution) and/or sufficiently many importance samples are used. Conversely, we show that the common practice of using a single point estimate can be biased, e.g., a single phylogeny estimate to transform an unrooted phylogeny into a time-calibrated phylogeny. We demonstrate the theory of sequential Bayesian inference using both a toy example and an empirical case study of divergence-time estimation in insects using a relaxed clock model from transcriptome data. 
In the empirical example, we estimate three posterior distributions of branch lengths from the same data (DNA character matrix with a GTR+Γ+I substitution model, an amino acid data matrix with empirical substitution models, and an amino acid data matrix with the PhyloBayes CAT-GTR model). Finally, we apply three different node-calibration strategies and show that divergence-time estimates are affected by both the data source and underlying substitution process to estimate branch lengths as well as the node-calibration strategies. Thus, our new sequential Bayesian phylogenetic inference provides the opportunity to efficiently test different approaches for divergence time estimation, including branch-length estimation from other software.
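The core idea in the abstract above, reusing posterior samples from one step as weighted input to the next, can be illustrated with a toy conjugate-normal model. This is only a sketch under stated assumptions, not the authors' implementation: the prior N(0, 10^2), the known observation standard deviation, and the function names `normal_posterior`, `sequential_estimate`, and `joint_estimate` are all hypothetical. Step 1 samples the posterior of a mean given a first data set; step 2 weights those samples by the likelihood of a second data set, and the weighted mean is compared with the analytic joint posterior mean.

```python
import math
import random


def normal_posterior(prior_mean, prior_sd, data, sigma):
    """Conjugate update for a normal mean with known observation sd."""
    precision = 1.0 / prior_sd**2 + len(data) / sigma**2
    mean = (prior_mean / prior_sd**2 + sum(data) / sigma**2) / precision
    return mean, math.sqrt(1.0 / precision)


def sequential_estimate(y1, y2, sigma=1.0, n_samples=20000, seed=1):
    """Two-step estimate: sample p(theta | y1), then reweight by p(y2 | theta)."""
    rng = random.Random(seed)
    # Step 1: posterior given y1 only (assumed prior: N(0, 10^2)).
    m1, s1 = normal_posterior(0.0, 10.0, y1, sigma)
    draws = [rng.gauss(m1, s1) for _ in range(n_samples)]
    # Step 2: importance weights are the likelihood of y2 at each draw.
    log_w = [sum(-0.5 * ((y - t) / sigma) ** 2 for y in y2) for t in draws]
    top = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - top) for lw in log_w]
    return sum(w * t for w, t in zip(weights, draws)) / sum(weights)


def joint_estimate(y1, y2, sigma=1.0):
    """Analytic posterior mean when both data sets are analysed jointly."""
    return normal_posterior(0.0, 10.0, y1 + y2, sigma)[0]


if __name__ == "__main__":
    y1 = [1.2, 0.8, 1.1, 0.9]
    y2 = [1.4, 1.6]
    print("sequential:", sequential_estimate(y1, y2))
    print("joint:     ", joint_estimate(y1, y2))
```

In this toy setting the two estimates agree closely when many samples are drawn, which mirrors the abstract's statement that the sequential approach matches joint inference given sufficiently many importance samples. Using only the step-1 posterior mean as a fixed input would discard the step-1 uncertainty.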

Citations: 0
Phylogenomics, Lineage Diversification Rates, and the Evolution of Diadromy in Clupeiformes (Anchovies, Herrings, Sardines, and Relatives)
IF 6.5, Zone 1 (Biology), Q1 Agricultural and Biological Sciences, Pub Date: 2024-05-17, DOI: 10.1093/sysbio/syae022
Joshua P Egan, Andrew M Simons, Mohammad Sadegh Alavi-Yeganeh, Michael P Hammer, Prasert Tongnunui, Dahiana Arcila, Ricardo Betancur-R, Devin D Bloom
Migration independently evolved numerous times in animals, with a myriad of ecological and evolutionary implications. In fishes, perhaps the most extreme form of migration is diadromy, the migration between marine and freshwater environments. Key and longstanding questions are: how many times has diadromy evolved in fishes, how frequently do diadromous clades give rise to non-diadromous species, and does diadromy influence lineage diversification rates? Many diadromous fishes have large geographic ranges with constituent populations that use isolated freshwater habitats. This may limit gene flow among some populations, increasing the likelihood of speciation in diadromous lineages relative to non-diadromous lineages. Alternatively, diadromy may reduce lineage diversification rates if migration is associated with enhanced dispersal capacity that facilitates gene flow within and between populations. Clupeiformes (herrings, sardines, shads and anchovies) is a model clade for testing hypotheses about the evolution of diadromy because it includes an exceptionally high proportion of diadromous species and several independent evolutionary origins of diadromy. However, relationships among major clupeiform lineages remain unresolved and existing phylogenies sparsely sampled diadromous species, limiting the resolution of phylogenetically-informed statistical analyses. We assembled a phylogenomic dataset and used multi-species coalescent and concatenation-based approaches to generate the most comprehensive, highly-resolved clupeiform phylogeny to date, clarifying associations among several major clades and identifying recalcitrant relationships needing further examination. We determined that variation in rates of sequence evolution (heterotachy) and base-composition (non-stationarity) had little impact on our results. 
Using this phylogeny, we characterized evolutionary patterns of diadromy and tested for differences in lineage diversification rates between diadromous, marine, and freshwater lineages. We identified thirteen transitions to diadromy, all during the Cenozoic Era (ten origins of anadromy, two origins of catadromy, and one origin of amphidromy), and seven losses of diadromy. Two diadromous lineages rapidly generated non-diadromous species, demonstrating that diadromy is not an evolutionary dead-end. We discovered considerably faster transition rates out of diadromy than to diadromy. The largest lineage diversification rate increase in Clupeiformes was associated with a transition to diadromy, but we uncovered little statistical support for categorically faster lineage diversification rates in diadromous versus non-diadromous fishes. We propose that diadromy may increase the potential for accelerated lineage diversification, particularly in species that migrate long distances. However, this potential may only be realized in certain biogeographic contexts, such as when diadromy allows access to ecosystems in which there is limited competition from extant species.
Citations: 0
Benefits and Limits of Phasing Alleles for Network Inference of Allopolyploid Complexes
IF 6.5, Zone 1 (Biology), Q1 Agricultural and Biological Sciences, Pub Date: 2024-05-11, DOI: 10.1093/sysbio/syae024
George P Tiley, Andrew A Crowl, Paul S Manos, Emily B Sessa, Claudia Solís-Lemus, Anne D Yoder, J Gordon Burleigh

Accurately reconstructing the reticulate histories of polyploids remains a central challenge for understanding plant evolution. Although phylogenetic networks can provide insights into relationships among polyploid lineages, inferring networks may be hindered by the complexities of homology determination in polyploid taxa. We use simulations to show that phasing alleles from allopolyploid individuals can improve phylogenetic network inference under the multispecies coalescent by obtaining the true network with fewer loci compared to haplotype consensus sequences or sequences with heterozygous bases represented as ambiguity codes. Phased allelic data can also improve divergence time estimates for networks, which is helpful for evaluating allopolyploid speciation hypotheses and proposing mechanisms of speciation. To achieve these outcomes in empirical data, we present a novel pipeline that leverages a recently developed phasing algorithm to reliably phase alleles from polyploids. This pipeline is especially appropriate for target enrichment data, where depth of coverage is typically high enough to phase entire loci. We provide an empirical example in the North American Dryopteris fern complex that demonstrates insights from phased data as well as the challenges of network inference. We establish that our pipeline (PATÉ: Phased Alleles from Target Enrichment data) is capable of recovering a high proportion of phased loci from both diploids and polyploids. These data may improve network estimates compared to using haplotype consensus assemblies by accurately inferring the direction of gene flow, but statistical non-identifiability of phylogenetic networks poses a barrier to inferring the evolutionary history of reticulate complexes.

Citations: 0
Phylogenomic discordance is driven by wide-spread introgression and incomplete lineage sorting during rapid species diversification within rattlesnakes (Viperidae: Crotalus and Sistrurus)
IF 6.5, Zone 1 (Biology), Q1 Agricultural and Biological Sciences, Pub Date: 2024-05-02, DOI: 10.1093/sysbio/syae018
Edward A Myers, Rhett M Rautsaw, Miguel Borja, Jason Jones, Christoph I Grünwald, Matthew L Holding, Felipe Grazziotin, Christopher L Parkinson
Phylogenomics allows us to uncover the historical signal of evolutionary processes through time and estimate phylogenetic networks accounting for these signals. Insight from genome-wide data further allows us to pinpoint the contributions to phylogenetic signal from hybridization, introgression, and ancestral polymorphism across the genome. Here we focus on how these processes have contributed to phylogenetic discordance among rattlesnakes (genera Crotalus and Sistrurus), a group for which there are numerous conflicting phylogenetic hypotheses based on a diverse array of molecular datasets and analytical methods. We address the instability of the rattlesnake phylogeny using genomic data generated from transcriptomes sampled from nearly all known species. These genomic data, analyzed with coalescent and network-based approaches, reveal numerous instances of rapid speciation where individual gene trees conflict with the species tree. Moreover, the evolutionary history of rattlesnakes is dominated by incomplete speciation and frequent hybridization, both of which have likely influenced past interpretations of phylogeny. We present a new framework in which the evolutionary relationships of this group can only be understood in light of genome-wide data and network-based analytical methods. Our data suggest that network radiations, like seen within the rattlesnakes, can only be understood in a phylogenomic context, necessitating similar approaches in our attempts to understand evolutionary history in other rapidly radiating species.
Citations: 0
Complex Polyploids: Origins, Genomic Composition, and Role of Introgressed Alleles
IF 6.5, Zone 1 (Biology), Q1 Agricultural and Biological Sciences, Pub Date: 2024-04-13, DOI: 10.1093/sysbio/syae012
J Luis Leal, Pascal Milesi, Eva Hodková, Qiujie Zhou, Jennifer James, D Magnus Eklund, Tanja Pyhäjärvi, Jarkko Salojärvi, Martin Lascoux
Introgression allows polyploid species to acquire new genomic content from diploid progenitors or from other unrelated diploid or polyploid lineages, contributing to genetic diversity and facilitating adaptive allele discovery. In some cases, high levels of introgression elicit the replacement of large numbers of alleles inherited from the polyploid’s ancestral species, profoundly reshaping the polyploid’s genomic composition. In such complex polyploids, it is often difficult to determine which taxa were the progenitor species and which taxa provided additional introgressive blocks through subsequent hybridization. Here, we use population-level genomic data to reconstruct the phylogenetic history of Betula pubescens (downy birch), a tetraploid species often assumed to be of allopolyploid origin and which is known to hybridize with at least four other birch species. This was achieved by modeling polyploidization and introgression events under the multispecies coalescent and then using an approximate Bayesian computation rejection algorithm to evaluate and compare competing polyploidization models. We provide evidence that B. pubescens is the outcome of an autoploid genome doubling event in the common ancestor of B. pendula and its extant sister species, B. platyphylla, that took place approximately 178,000–188,000 generations ago. Extensive hybridization with B. pendula, B. nana, and B. humilis followed in the aftermath of autopolyploidization, with the relative contribution of each of these species to the B. pubescens genome varying markedly across the species’ range. Functional analysis of B. pubescens loci containing alleles introgressed from B. nana identified multiple genes involved in climate adaptation, while loci containing alleles derived from B. humilis revealed several genes involved in the regulation of meiotic stability and pollen viability in plant species.
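The approximate Bayesian computation (ABC) rejection step mentioned in the abstract above can be sketched generically as follows. This is a toy illustration, not the study's pipeline: the two candidate models "A" and "B", their uniform priors, the normal simulator, the sample-mean summary statistic, and the tolerance `eps` are all assumptions chosen for brevity. A model and a parameter are drawn from their priors, data are simulated, and the draw is kept only if the simulated summary lies close to the observed one; the share of accepted draws per model approximates its posterior probability.

```python
import random


def simulate_mean(mu, n, rng):
    """Simulate n normal observations with mean mu and return their mean."""
    return sum(rng.gauss(mu, 1.0) for _ in range(n)) / n


def abc_model_choice(obs_summary, n_obs=20, n_sims=5000, eps=0.2, seed=7):
    """ABC rejection over two hypothetical models with equal prior weight."""
    rng = random.Random(seed)
    # Assumed priors: model A has mu ~ U(-1, 1), model B has mu ~ U(1, 3).
    priors = {"A": (-1.0, 1.0), "B": (1.0, 3.0)}
    accepted = {"A": 0, "B": 0}
    for _ in range(n_sims):
        model = rng.choice(["A", "B"])
        low, high = priors[model]
        mu = rng.uniform(low, high)
        # Keep the draw only if the simulated summary is within eps of the data.
        if abs(simulate_mean(mu, n_obs, rng) - obs_summary) < eps:
            accepted[model] += 1
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; increase eps or n_sims")
    return {name: count / total for name, count in accepted.items()}


if __name__ == "__main__":
    print(abc_model_choice(obs_summary=2.1))
```

With an observed summary of 2.1, nearly all accepted simulations come from model B, because model A's prior rarely produces sample means that high. In a real analysis the simulator would be a coalescent model and the summaries would be population-genetic statistics.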
Citations: 0
Museum genomics reveals the hybrid origin of an extinct crater lake endemic
IF 6.5, Zone 1 (Biology), Q1 Agricultural and Biological Sciences, Pub Date: 2024-04-10, DOI: 10.1093/sysbio/syae017
Amy R Tims, Peter J Unmack, Michael P Hammer, Culum Brown, Mark Adams, Matthew D McGee
Crater lake fishes are common evolutionary model systems, with recent studies suggesting a key role for gene flow in promoting rapid adaptation and speciation. However, the study of these young lakes can be complicated by human-mediated extinctions. Museum genomics approaches integrating genetic data from recently extinct species are therefore critical to understanding the complex evolutionary histories of these fragile systems. Here, we examine the evolutionary history of an extinct Southern Hemisphere crater lake endemic, the rainbowfish Melanotaenia eachamensis. We undertook comprehensive sampling of extant rainbowfish populations of the Atherton Tablelands of Australia alongside historical museum material to understand the evolutionary origins of the extinct crater lake population and the dynamics of gene flow across the ecoregion. The extinct crater lake species is genetically distinct from all other nearby populations due to historic introgression between two proximate riverine lineages, similar to other prominent crater lake speciation systems, but this historic gene flow has not been sufficient to induce a species flock. Our results suggest that museum genomics approaches can be successfully combined with extant sampling to unravel complex speciation dynamics involving recently extinct species.
Citations: 0
Museum skins enable identification of introgression associated with cytonuclear discordance
IF 6.5; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2024-04-05; DOI: 10.1093/sysbio/syae016
Sally Potter, Craig Moritz, Maxine P Piggott, Jason G Bragg, Ana C Afonso Silva, Ke Bi, Christiana McDonald-Spicer, Rustamzhon Turakulov, Mark D B Eldridge
Increased sampling of genomes and populations across closely related species has revealed that levels of genetic exchange during and after speciation are higher than previously thought. One obvious manifestation of such exchange is strong cytonuclear discordance, where the divergence in mitochondrial DNA (mtDNA) differs from that for nuclear genes more (or less) than expected from differences between mtDNA and nuclear DNA (nDNA) in population size and mutation rate. Given genome-scale datasets and coalescent modelling, we can now confidently identify cases of strong discordance and test specifically for historical or recent introgression as the cause. Using population sampling that combines exon capture data from historical museum specimens and recently collected tissues, we showcase how genomic tools can resolve complex evolutionary histories in the brachyotis group of rock-wallabies (Petrogale). In particular, by applying population and phylogenomic approaches, we can assess the role of demographic processes in driving complex evolutionary patterns and the role of ancient introgression and hybridisation. We find that described species are well supported as monophyletic taxa for nDNA genes, but not for mtDNA, with cytonuclear discordance involving at least four operational taxonomic units (OTUs) across four species which diverged 183-278 kya. ABC modelling of nDNA gene trees supports introgression during or after speciation for some taxon pairs with cytonuclear discordance. Given substantial differences in body size between the species involved, this evidence for gene flow is surprising. Heterogeneous patterns of introgression were identified but do not appear to be associated with chromosome differences between species. These and previous results suggest that dynamic past climates across the monsoonal tropics could have promoted reticulation among related species.
Citations: 0
Gene Flow and Isolation in the Arid Nearctic Revealed by Genomic Analyses of Desert Spiny Lizards
IF 6.5; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2024-01-05; DOI: 10.1093/sysbio/syae001
Carlos J Pavón-Vázquez, Qaantah Rana, Keaka Farleigh, Erika Crispo, Mimi Zeng, Jeevanie Liliah, Daniel Mulcahy, Alfredo Ascanio, Tereza Jezkova, Adam D Leaché, Tomas Flouri, Ziheng Yang, Christopher Blair
The opposing forces of gene flow and isolation are two major processes shaping genetic diversity. Understanding how these vary across space and time is necessary to identify the environmental features that promote diversification. The detection of considerable geographic structure in taxa from the arid Nearctic has prompted research into the drivers of isolation in the region. Several geographic features have been proposed as barriers to gene flow, including the Colorado River, Western Continental Divide, and a hypothetical Mid-Peninsular Seaway in Baja California. However, recent studies suggest that the role of barriers in genetic differentiation may have been overestimated when compared to other mechanisms of divergence. In this study, we infer historical and spatial patterns of connectivity and isolation in Desert Spiny Lizards (Sceloporus magister) and Baja Spiny Lizards (S. zosteromus), which together form a species complex composed of parapatric lineages with wide distributions in arid western North America. Our analyses incorporate mitochondrial sequences, genomic-scale data, and past and present climatic data to evaluate the nature and strength of barriers to gene flow in the region. Our approach relies on estimates of migration under the multispecies coalescent to understand the history of lineage divergence in the face of gene flow. Results show that the S. magister complex is geographically structured, but we also detect instances of gene flow. The Continental Divide is a strong barrier to gene flow, while the Colorado River is more permeable. Analyses yield conflicting results for the catalyst of differentiation of peninsular lineages in S. zosteromus. Our study shows how large-scale genomic data for thoroughly sampled species can shed new light on biogeography. Furthermore, our approach highlights the need for the combined analysis of multiple sources of evidence to adequately characterize the drivers of divergence.
Citations: 0
Evaluating the Accuracy of Methods for Detecting Correlated Rates of Molecular and Morphological Evolution.
IF 6.5; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2023-12-30; DOI: 10.1093/sysbio/syad055
Yasmin Asar, Hervé Sauquet, Simon Y W Ho

Determining the link between genomic and phenotypic change is a fundamental goal in evolutionary biology. Insights into this link can be gained by using a phylogenetic approach to test for correlations between rates of molecular and morphological evolution. However, there has been persistent uncertainty about the relationship between these rates, partly because conflicting results have been obtained using various methods that have not been examined in detail. We carried out a simulation study to evaluate the performance of 5 statistical methods for detecting correlated rates of evolution. Our simulations explored the evolution of molecular sequences and morphological characters under a range of conditions. Of the methods tested, Bayesian relaxed-clock estimation of branch rates was able to detect correlated rates of evolution correctly in the largest number of cases. This was followed by correlations of root-to-tip distances, Bayesian model selection, independent sister-pairs contrasts, and likelihood-based model selection. As expected, the power to detect correlated rates increased with the amount of data, both in terms of tree size and number of morphological characters. Likewise, greater among-lineage rate variation in the data led to improved performance of all 5 methods, particularly for Bayesian relaxed-clock analysis when the rate model was mismatched. We then applied these methods to a data set from flowering plants and did not find evidence of a correlation in evolutionary rates between genomic data and morphological characters. The results of our study have practical implications for phylogenetic analyses of combined molecular and morphological data sets, and highlight the conditions under which the links between genomic and phenotypic rates of evolution can be evaluated quantitatively.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10924723/pdf/
Citations: 0
Summary Tests of Introgression Are Highly Sensitive to Rate Variation Across Lineages.
IF 6.5; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2023-12-30; DOI: 10.1093/sysbio/syad056
Lauren E Frankel, Cécile Ané

The evolutionary implications and frequency of hybridization and introgression are increasingly being recognized across the tree of life. To detect hybridization from multi-locus and genome-wide sequence data, a popular class of methods is based on summary statistics from subsets of 3 or 4 taxa. However, these methods often carry the assumption of a constant substitution rate across lineages and genes, which is commonly violated in many groups. In this work, we quantify the effects of rate variation on the D test (also known as ABBA-BABA test), the D3 test, and HyDe. All 3 tests are used widely across a range of taxonomic groups, in part because they are very fast to compute. We consider rate variation across species lineages, across genes, their lineage-by-gene interaction, and rate variation across gene-tree edges. We simulated species networks according to a birth-death-hybridization process, so as to capture a range of realistic species phylogenies. For all 3 methods tested, we found a marked increase in the false discovery of reticulation (type-1 error rate) when there is rate variation across species lineages. The D3 test was the most sensitive, with around 80% type-1 error, such that D3 appears to be more sensitive to a departure from the clock than to the presence of reticulation. For all 3 tests, the power to detect hybridization events decreased as the number of hybridization events increased, indicating that multiple hybridization events can obscure one another if they occur within a small subset of taxa. Our study highlights the need to consider rate variation when using site-based summary statistics, and points to the advantages of methods that do not require assumptions on evolutionary rates across lineages or across genes.
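
For readers unfamiliar with the D test named in this abstract, the following is a minimal sketch of how the ABBA-BABA statistic is commonly computed from four aligned sequences ordered as (P1, P2, P3, outgroup). It is not taken from the paper: the function name `d_statistic` and the toy sequences are hypothetical, only biallelic sites with unambiguous bases are counted, and no significance test (such as a block jackknife) is included.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (n_abba, n_baba, D) for four aligned sequences.

    Assumed taxon order is (P1, P2, P3, outgroup), with the outgroup
    allele treated as ancestral ("A") and the other allele as derived ("B").
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    n_abba = 0
    n_baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        site = (a, b, c, o)
        # Keep only biallelic sites made of unambiguous nucleotides.
        if len(set(site)) != 2 or any(x not in "ACGT" for x in site):
            continue
        if a == o and b == c and b != o:
            n_abba += 1  # pattern ABBA: P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            n_baba += 1  # pattern BABA: P1 and P3 share the derived allele

    total = n_abba + n_baba
    d = (n_abba - n_baba) / total if total else float("nan")
    return n_abba, n_baba, d


# Toy alignment (hypothetical data): 2 ABBA sites, 1 BABA site, so D = 1/3.
print(d_statistic("AAGTA", "GAATC", "GAGTC", "AAATA"))
```

The statistic only measures an imbalance between the two discordant site patterns. It says nothing about why the imbalance arose, which is consistent with the abstract's finding that rate variation across lineages can raise the type-1 error rate of such tests.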

Citations: 0