Genome Biology and Evolution: Latest Articles

Molecular and Functional Divergence of Zebrafish Sox Paralogs Controlling Endoderm Formation and Left-Right Patterning.
IF 2.8 | Zone 2, Biology | Q2 Evolutionary Biology | Pub Date: 2025-10-29 | DOI: 10.1093/gbe/evaf213
Simaran Johal, Randa Elsayed, Dongfeng Wang, Conor D Talbot, Roberto Feuda, Kristen A Panfilio, Andrew C Nelson

Endoderm, one of three primary germ layers of vertebrate embryos, makes major contributions to the respiratory and gastrointestinal tracts and associated organs, including the liver and pancreas. In mammals, transcription factor (TF) SOX17 is vital for endoderm organ formation and can induce endoderm progenitor identity. Duplication of ancestral sox17 before or during the early evolution of ray-finned fishes produced paralogs sox32 and sox17 in zebrafish. Sox32 is required for specification of endoderm and progenitors of the left-right (LR) organizer (Kupffer's Vesicle, KV), with Sox17 a downstream target of Sox32 implicated in further KV development. Phenotypic evidence, therefore, suggests functional similarities between zebrafish Sox32 and Sox17 and mammalian SOX17. Here, we directly compare these orthologs and paralogs, using the early zebrafish embryo as a biological platform for functional testing. Our results indicate that, unlike Sox32, human SOX17 cannot induce endoderm specification in zebrafish. Furthermore, using hybrid protein functional analyses, we show that Sox32 specificity for the endoderm gene regulatory network is linked to evolutionary divergence in its DNA-binding High Mobility Group domain from its paralog Sox17. Additionally, changes in the C-terminal regions of Sox32 and Sox17 underpin their differing target specificities. Finally, we establish that specific conserved peptides in the Sox17 C-terminal domain are essential for its role in establishing correct organ asymmetry. Overall, our results illuminate the molecular basis for functional divergence of Sox32 and Sox17 in vertebrate endoderm development and LR patterning, and reveal that alterations in specific domains of both TFs at different points during the evolution of fish are critical to their distinct and essential functions.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12648240/pdf/
Citations: 0
Phylogeny-aware Simulations Suggest a Low Impact of Unsampled Lineages in the Inference of Gene Flow During Eukaryogenesis.
IF 2.8 | Zone 2, Biology | Q2 Evolutionary Biology | Pub Date: 2025-10-29 | DOI: 10.1093/gbe/evaf190
Moisès Bernabeu, Saioa Manzano-Morales, Toni Gabaldón

The topologies of gene trees are broadly used to infer horizontal gene transfer events and characterize the potential donor and acceptor partners. Additionally, ratios between branch lengths in the gene tree can inform about the timing of transfers relative to each other. Using this approach, recent studies have proposed a relative chronology of gene acquisitions in the lineage leading to the last eukaryotic common ancestor. However, a recognized caveat of the branch-length ratio method is the potential for bias due to incomplete taxon sampling, which results in so-called "ghost" lineages. Here, we assessed the effect of ghost lineages on the inference of the relative ordering of gene acquisition events during eukaryogenesis. For this, we used a novel simulation framework that populates a dated Tree of Life with plausible "ghost" lineages and simulates their gene transfers to the lineage leading to the last eukaryotic common ancestor. Our simulations suggest that a substantial majority of gene acquisitions from distinct ghost donors are inferred with the correct relative order. However, we identify phylogenetic placements where ghost lineages would be more likely to produce misleading results. Overall, our approach offers valuable guidance for the interpretation of future work on eukaryogenesis, and can be readily adapted to other evolutionary scenarios.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12573248/pdf/
Citations: 0
Paleogenomics Reveals a Loss of Bovine Lineages in Mid-latitude Asia Over the Last 200,000 Years.
IF 2.8 | Zone 2, Biology | Q2 Evolutionary Biology | Pub Date: 2025-10-29 | DOI: 10.1093/gbe/evaf206
Alexandre Gilardet, Jonas Oppenheimer, Mikkel-Holger S Sinding, Edana Lord, J Camilo Chacón-Duque, Gonzalo Oteo-García, Georgios Xenikoudakis, Pavel Kosintsev, John Southon, Sergey K Vasiliev, Michael V Shunkov, Maxim B Kozlikin, Katerina Douka, Beth Shapiro, Peter D Heintzman, Love Dalén

Bovines have a complex yet poorly understood evolutionary history that is characterized by admixture and diversity loss during the Late Pleistocene. Unraveling this history is challenging in part because deep-time and geographically widespread genetic data are currently limited. In mid-latitude Asia, Denisova Cave, located in the Altai, Siberia, and nearby paleontological sites have yielded a large collection of remains spanning the Middle to Late Pleistocene, many of which are identifiable as bovines via morphology or paleoproteomics. In this study, we screened these bovine bones for ancient DNA and generated mitogenomes, to refine knowledge of Pleistocene bovine diversity in the region. We found that bovines carrying a yak-like mitogenome were common residents of the Altai mountains, along with bison belonging to the clade X mitochondrial lineage and, more rarely, aurochs. The yak-like mitochondrial lineage identified in this study represents a previously unknown lineage sister to present-day yak mitogenome diversity. This yak-like mitochondrial lineage, termed yak X, was identified at several sites, and survived in mid-latitude Asia across climatic transitions for around 200,000 years. Our findings suggest that all three bovine taxa harbored diversity no longer present in extant populations, thus mirroring archaic hominin findings at Denisova Cave. The Altai mountains therefore appear to have been a hotspot of both bovine and hominin diversity.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12628791/pdf/
Citations: 0
Natural Selection in Transcription Factor-DNA Interaction Motifs: A Comparative and Population Genomics Perspective.
IF 2.8 | Zone 2, Biology | Q2 Evolutionary Biology | Pub Date: 2025-10-29 | DOI: 10.1093/gbe/evaf212
Manas Joshi, Pablo Duchen, Adamandia Kapopoulou, Stefan Laurent

Natural selection heavily influences the evolutionary trajectories of species by impacting their genotype-to-phenotype transitions. On the molecular level, these transitions are shaped by regulatory sequences. In this study, we employed a combination of population and comparative genomics to investigate how natural selection affects specific regulatory sequence classes involved in regulatory transcription factor-DNA interactions. These interactions consist of two motifs, namely transcription factor-binding domains and transcription factor-binding sites. Using publicly available annotation data for Homo sapiens, Arabidopsis thaliana, and Drosophila melanogaster, we first constructed species-specific lists of the transcription factor-binding domain regions. On applying some of the commonly used summary statistics, we found signals of purifying selection acting on transcription factor-binding domains, consistent with their functional importance. Next, using biochemical assay-based annotations, we identified potential transcription factor-binding site regions and used variants within them as nonsynonymous equivalents. Interestingly, we also observed that noncoding transcription factor-binding site regions showed levels of constraint similar to those of coding regions for populations with large Ne. Signals of positive selection were limited. Nevertheless, McDonald-Kreitman estimates revealed that, in both fruit fly and thale cress, α for transcription factor-binding domains was consistently higher than for adjacent nonbinding domains, whereas no such difference was apparent in humans. Taken together, our comparative analysis shows that the efficiency of negative (and, to a lesser extent, positive) selection on transcription factor-DNA interface elements scales with effective population size. The dataset and analysis pipeline provide a baseline for future studies of regulatory evolution across coding and noncoding regions.
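
For readers unfamiliar with the α statistic mentioned above, the following is a minimal sketch of the standard McDonald-Kreitman-based estimate, α = 1 − (Ds·Pn)/(Dn·Ps), where Dn and Ds are nonsynonymous and synonymous fixed differences and Pn and Ps are the corresponding polymorphism counts. The function name `mk_alpha` and the counts used are hypothetical illustrations, not values or code from the study, and the authors' actual pipeline may differ (for example, in how sites are classified or low-frequency variants are filtered).

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Estimate the proportion of adaptive substitutions (alpha).

    dn, ds: fixed (divergent) nonsynonymous / synonymous differences
    pn, ps: polymorphic nonsynonymous / synonymous sites
    Hypothetical helper for illustration only.
    """
    if dn == 0 or ps == 0:
        # alpha is undefined when either denominator term is zero
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Illustrative (made-up) counts: 20 and 40 fixed differences,
# 10 and 40 polymorphisms.
alpha = mk_alpha(dn=20, ds=40, pn=10, ps=40)
print(alpha)
```

Under this formula, a positive α is read as an excess of nonsynonymous divergence relative to the neutral expectation from polymorphism, while values near zero or below suggest little evidence of adaptive fixation.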

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12645836/pdf/
Citations: 0
De Novo Gene Emergence: Summary, Classification, and Challenges of Current Methods.
IF 2.8 | Zone 2, Biology | Q2 Evolutionary Biology | Pub Date: 2025-10-29 | DOI: 10.1093/gbe/evaf197
Anna Grandchamp, Margaux Aubel, Lars A Eicholt, Paul Roginski, Victor Luria, Amir Karger, Elias Dohmen

A novel mechanism of de novo gene origination from nongenic sequences was first proposed in the early 2000s. Subsequent studies have since provided evidence of de novo gene emergence across all domains of life, revealing its occurrence to be more frequent than initially anticipated. While studies mainly agree on the general concept of de novo emergence from nongenic DNA, the exact methods and definitions for detecting de novo genes differ significantly. Here, we provide a comprehensive step-by-step description of the most commonly used methods for de novo gene detection. In addition, we address the limitations of nomenclature and detection methods and clarify some complex concepts that are sometimes misused. This review is accompanied by the publication of a de novo gene annotation format to standardize the reporting of methodology, enable reproducibility and improve the comparability of datasets.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12605812/pdf/
Citations: 0
HJ Muller and the Relationship Between Sex Chromosome Degeneration and the Evolution of Dosage Compensation.
IF 2.8 | Zone 2, Biology | Q2 Evolutionary Biology | Pub Date: 2025-10-29 | DOI: 10.1093/gbe/evaf195
Brian Charlesworth, Deborah Charlesworth

A lack of recombination in the heterogametic sex between parts or all of newly evolving sex chromosomes results in the gradual accumulation of deleterious mutations on proto-Y or proto-W chromosomes. This "genetic degeneration" is caused by several population genetic mechanisms. It can eventually lead to the loss of functionality and deletions of Y- or W-linked genes in species with male or female heterogamety, respectively, reducing the fitness of heterozygous XY males or ZW females. This creates selection to compensate for such degeneration. Contemporary studies of degeneration and dosage compensation are built on classical genetic work by HJ Muller, with molecular analyses of genomes and gene expression now revealing new details. We review these studies, integrating ideas about how degeneration and compensation evolve. We discuss whether these two processes evolve together, whether the initial changes involved in compensation occurred in individual sex-linked genes ("piecemeal"), and whether they were sex specific. We also discuss the idea that control of expression across larger chromosome regions reflects later changes, after increased expression of X- or Z-linked genes in both sexes favored reduced X expression in females (or Z expression in males with female heterogamety). We summarize the currently available empirical evidence and discuss difficulties involved in documenting the evolutionary changes that lead to the different types of dosage compensation, as well as limitations of the data for testing evolutionary hypotheses.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12598394/pdf/
Citations: 0
An Ocean of Opsins.
IF 2.8 | Zone 2, Biology | Q2 Evolutionary Biology | Pub Date: 2025-10-29 | DOI: 10.1093/gbe/evaf189
Giacinto De Vivo, Eric Pelletier, Roberto Feuda, Salvatore D'Aniello

In this study, we explored the diversity and evolution of opsins using meta-omic data from the Tara Oceans and Tara Polar Circle expeditions, one of the largest marine datasets available. By using sequence similarity methods and phylogenetic analyses, we identified opsins across the different metazoan groups. Our results indicate that most of the opsin sequences belong to arthropods and vertebrates. We also detected sequences from all known opsin subfamilies, including r-opsin, c-opsin, xenopsin, and Group-4 opsins. Despite the broad taxonomic scope, no new opsin families were discovered; however, we provide valuable taxonomic insights into known opsin subfamilies and reinforce existing phylogenetic hypotheses. Additionally, we present novel opsin sequences from less-studied taxa, such as chaetognaths, rotifers, acoelomates, and tunicates, which may serve as a valuable resource for future research into opsin function and diversity.

{"title":"An Ocean of Opsins.","authors":"Giacinto De Vivo, Eric Pelletier, Roberto Feuda, Salvatore D'Aniello","doi":"10.1093/gbe/evaf189","DOIUrl":"10.1093/gbe/evaf189","url":null,"abstract":"<p><p>In this study, we explored the diversity and evolution of opsins using meta-omic data from the Tara Oceans and Tara Polar Circle expeditions, one of the largest marine datasets available. By using sequence similarity methods and phylogenetic analyses, we identified opsins across the different metazoan groups. Our results indicate that most of the opsin sequences belong to arthropods and vertebrates. We also detected sequences from all known opsin subfamilies, including r-opsin, c-opsin, xenopsin, and Group-4 opsins. Despite the broad taxonomic scope, no new opsin families were discovered; however, we provide valuable taxonomic insights into known opsin subfamilies and reinforce existing phylogenetic hypotheses. Additionally, we present novel opsin sequences from less-studied taxa, such as chaetognaths, rotifers, acoelomates, and tunicates, and which may serve as a valuable resource for future research into opsin function and diversity.</p>","PeriodicalId":12779,"journal":{"name":"Genome Biology and Evolution","volume":"17 11","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12584886/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145444673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Structural Rearrangements and Selection Promote Phenotypic Evolution in Anolis Lizards.
IF 2.8 Zone 2 (Biology) Q2 EVOLUTIONARY BIOLOGY Pub Date: 2025-10-29 DOI: 10.1093/gbe/evaf196
Raúl Araya-Donoso, Sarah M Baty, Jaime E Johnson, Eris Lasku, Jody M Taft, Rebecca E Fisher, Jonathan B Losos, Greer A Dolby, Kenro Kusumi, Anthony J Geneva

The genomic characteristics of adaptively radiated groups could contribute to their high species number and ecological disparity, by increasing their evolutionary potential. Here, we explored the genomic variation of Anolis lizards, focusing on three species with distinct phenotypes: Anolis auratus, one of the species with the longest tail; Anolis frenatus, one of the largest species; and Anolis carolinensis, one of the species that inhabits the coldest environments. We assembled and annotated two new chromosome-level reference genomes for A. auratus and A. frenatus and compared them with the available genomes of A. carolinensis and Anolis sagrei. We evaluated the presence of structural rearrangements, quantified the density of repeat elements, and identified potential signatures of positive selection in coding and regulatory regions. We detected substantial rearrangements in scaffolds 1, 2, and 3 of A. frenatus different from the other species, in which the rearrangement breakpoints corresponded to hotspots of developmental genes. Further, we detected an accumulation of repeats around key developmental genes in anoles and phrynosomatid outgroups. Finally, coding sequences and regulatory regions of genes relevant to development and physiology showed variation that could be associated with the unique phenotypes of the analyzed species. Our results show examples of the hierarchical genomic variation within anoles that could provide the substrate that promoted phenotypic disparity and contributed to their adaptive radiation.

Genome Biology and Evolution. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12596200/pdf/
Citations: 0
Inferring Domestic Goat Demographic History Through Ancient Genome Imputation.
IF 2.8 Zone 2 (Biology) Q2 EVOLUTIONARY BIOLOGY Pub Date: 2025-10-29 DOI: 10.1093/gbe/evaf181
Jolijn A M Erven, Alice Etourneau, Marjan Mashkour, Mahesh Neupane, Phillipe Bardou, Alessandra Stella, Andrea Talenti, Clet Wandui Masiga, Curtis P Van Tassell, Emily Clark, François Pompanon, Licia Colli, Marcel Amills, Marco Milanesi, Paola Crepaldi, Bertrand Servin, Benjamin D Rosen, Gwenola Tosser-Klopp, Kevin G Daly

Goats were among the earliest managed animals, making them a natural model to explore the genetic consequences of domestication. However, a challenge in ancient genomic analysis is the relatively low genome coverage for most samples, limiting analysis to pseudohaploid genotypes. Genotype imputation offers potential to alleviate this limitation by improving information content and accuracy in low coverage genomes. To test this, we used published high coverage (>8✕) goat palaeogenomes, imputing downsampled genomes using the VarGoats dataset (1,372 individuals) as a reference panel. Measuring concordance between imputed and high coverage genotypes, we find high concordance after filtering for common (>5%), high confidence variants, with 0.5✕ genomes reaching >0.97 concordance. There is a trade-off between coverage, genotype probability (GP) thresholds, and genotype recovery, where higher coverage and more lenient GP thresholds result in higher recovery, and a reduction in heterozygous false-positive rates with stricter thresholds. We then imputed 36 goat palaeogenomes with ≥0.5✕ coverage to examine runs-of-homozygosity (ROH) and identity-by-descent (IBD) patterns. Using a novel approach combining ROH profiles across tools, we find that among Neolithic goats, ROH increases with distance from the Zagros Mountains, suggesting a large effect of the initial dispersal of managed herds. Inbreeding levels decrease across Southwest Asia in more recent periods. IBD mirrored this pattern, with less relatedness in the early herding site of Ganj Dareh compared to higher relatedness in goats from later in the dispersal process. These findings provide insights into the genetic consequences of early goat management on demography, and confirm the utility of imputation in leveraging low coverage palaeogenomes.
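
The trade-off the abstract describes between genotype probability (GP) thresholds and genotype recovery can be illustrated with a small sketch. This is not the authors' pipeline: the function name `concordance_and_recovery`, the input layout (alternate-allele counts paired with a GP value), and the toy data are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical input layout: each imputed call is (genotype, genotype_probability),
# where genotype is the alternate-allele count (0, 1, or 2).
ImputedCall = Tuple[int, float]


def concordance_and_recovery(
    truth: Sequence[int],
    imputed: Sequence[ImputedCall],
    gp_threshold: float,
) -> Tuple[Optional[float], float]:
    """Return (concordance, recovery) for calls with GP >= gp_threshold.

    concordance: fraction of retained calls matching the high-coverage genotype
                 (None if no call passes the threshold).
    recovery:    fraction of all sites retained after GP filtering.
    """
    if len(truth) != len(imputed):
        raise ValueError("truth and imputed must have the same length")
    # Keep only the sites whose imputed call passes the GP filter.
    kept = [(t, g) for t, (g, gp) in zip(truth, imputed) if gp >= gp_threshold]
    recovery = len(kept) / len(truth) if truth else 0.0
    if not kept:
        return None, recovery
    matches = sum(1 for t, g in kept if t == g)
    return matches / len(kept), recovery


# Toy data (invented): high-coverage genotypes and imputed calls at five sites.
truth = [0, 1, 2, 1, 0]
imputed = [(0, 0.99), (1, 0.95), (2, 0.60), (0, 0.55), (0, 0.999)]

lenient = concordance_and_recovery(truth, imputed, 0.5)
strict = concordance_and_recovery(truth, imputed, 0.9)
print("GP >= 0.5:", lenient)
print("GP >= 0.9:", strict)
```

In this toy case the stricter threshold discards the one discordant call along with a correct one, so concordance rises while recovery falls, which is the direction of the trade-off reported above.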

Genome Biology and Evolution, vol. 17, no. 11. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12598287/pdf/
Citations: 0
Species Tree Branch Length Estimation despite Incomplete Lineage Sorting, Duplication, and Loss.
IF 2.8 Zone 2 (Biology) Q2 EVOLUTIONARY BIOLOGY Pub Date: 2025-10-29 DOI: 10.1093/gbe/evaf200
Yasamin Tabatabaee, Chao Zhang, Shayesteh Arasti, Siavash Mirarab

Phylogenetic branch lengths are essential for many analyses, such as estimating divergence times, analyzing rate changes, and studying adaptation. However, true gene tree heterogeneity due to incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer can complicate the estimation of species tree branch lengths. While several tools exist for estimating the topology of a species tree addressing various causes of gene tree discordance, much less attention has been paid to branch length estimation on multi-locus datasets. For single-copy gene trees, some methods are available that summarize gene tree branch lengths onto a species tree, including coalescent-based methods that account for heterogeneity due to incomplete lineage sorting. However, no such branch length estimation method exists for multi-copy gene family trees that have evolved with gene duplication and loss. To address this gap, we introduce the CASTLES-Pro algorithm for estimating species tree branch lengths while accounting for both gene duplication and loss and incomplete lineage sorting. CASTLES-Pro improves on the existing coalescent-based branch length estimation method CASTLES by increasing its accuracy for single-copy gene trees and extending it to handle multi-copy ones. Our simulation studies show that CASTLES-Pro is generally more accurate than alternatives, eliminating the systematic bias toward overestimating terminal branch lengths often observed when using concatenation. Moreover, while not theoretically designed for horizontal gene transfer, we show that CASTLES-Pro is relatively robust to random horizontal gene transfer, though its accuracy can degrade at the highest levels of horizontal gene transfer.

Genome Biology and Evolution, vol. 17, no. 11. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12648238/pdf/
Citations: 0