Fitness landscapes provide a rigorous mathematical framework for analyzing evolutionary dynamics, including the study of epistasis, the main obstacle to predicting phenotype from genotype. In 2011, Poelwijk et al. formulated a foundational theorem stating that in any multi-peaked fitness landscape, "at least two mutations exhibit reciprocal sign epistasis" (Poelwijk et al., J. Theor. Biol., 272:141). The proof relied on the implicit assumption that neutral mutations are absent, commonly accepted in theoretical studies in evolutionary biology. In this study, we extend Poelwijk et al.'s analysis by incorporating genotypes with equal fitness, specifically, accounting for neutral mutations. We demonstrate that when neutral mutations are considered, conventional pairwise reciprocal sign epistasis (RSE) may be entirely absent from a multi-peaked landscape. Instead, RSE is guaranteed only when considering "distant" RSE defined through composite mutations, wherein groups of mutations are treated collectively across all their possible combinations. Applying these concepts to empirical fitness landscapes faces a practical limitation: phenotypic measurements contain experimental noise, making some mutational effects statistically indistinguishable from zero. Under such conditions, statistically significant detection of RSE in multi-peaked landscapes may be impossible even when composite mutations are considered. Theoretically, our findings imply that in the presence of neutral mutations, compensatory mutations in a multi-peaked fitness landscape need not be adjacent; rather, compensation can occur following one or more neutral steps along an evolutionary path. Practically, in real-world scenarios where fitness measurements contain uncertainty, there may be a fundamental technical limitation to detecting RSE in a statistically significant manner within multi-peaked landscapes.
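The distinction between strict pairwise RSE and landscapes containing neutral steps can be illustrated with a small sketch. It is not the authors' method: the function names (`peaks`, `pairwise_rse`), the definition of a peak as a genotype strictly fitter than all single-mutant neighbours, and the toy fitness values are all assumptions made for illustration. RSE is counted only when both mutations strictly change the sign of their effect, so a zero (neutral) effect never qualifies.

```python
from itertools import combinations, product


def sign(x):
    """Return -1, 0 or +1 for a fitness difference."""
    return (x > 0) - (x < 0)


def peaks(fitness, n):
    """Genotypes strictly fitter than every single-mutant neighbour (assumed peak definition)."""
    found = []
    for g in product((0, 1), repeat=n):
        neighbours = [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(n)]
        if all(fitness[g] > fitness[h] for h in neighbours):
            found.append(g)
    return found


def pairwise_rse(fitness, n):
    """List (locus_i, locus_j, background) for every two-locus square with strict RSE."""
    hits = []
    for i, j in combinations(range(n), 2):
        others = [k for k in range(n) if k not in (i, j)]
        for bg in product((0, 1), repeat=len(others)):

            def geno(a, b):
                g = [0] * n
                for k, v in zip(others, bg):
                    g[k] = v
                g[i], g[j] = a, b
                return tuple(g)

            f00, f10 = fitness[geno(0, 0)], fitness[geno(1, 0)]
            f01, f11 = fitness[geno(0, 1)], fitness[geno(1, 1)]
            # Each mutation must have strictly opposite effects on the two backgrounds.
            i_flips = sign(f10 - f00) * sign(f11 - f01) == -1
            j_flips = sign(f01 - f00) * sign(f11 - f10) == -1
            if i_flips and j_flips:
                hits.append((i, j, bg))
    return hits


# Toy two-locus landscape without neutral mutations: two peaks and one RSE square.
classic = {(0, 0): 1.0, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 1.0}

# Toy three-locus landscape: 000 and 111 are peaks, all intermediates have equal fitness.
neutral = {g: 2.0 if sum(g) in (0, 3) else 1.0 for g in product((0, 1), repeat=3)}

print(peaks(classic, 2), pairwise_rse(classic, 2))
print(peaks(neutral, 3), pairwise_rse(neutral, 3))
```

In the second toy landscape every path between the two peaks passes through neutral steps, so no two-locus square shows a strict sign change for both mutations even though the landscape has two peaks.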
Dmitry N Ivankov, Evgenii M Zorin. Sign epistasis can be absent in multi-peaked landscapes with neutral mutations. Genome Biology and Evolution (journal article), published 2026-01-30. doi: 10.1093/gbe/evag024, https://doi.org/10.1093/gbe/evag024
Domestic organisms provide valuable models for studying the impact of population bottlenecks, inbreeding, and artificial selection on the accumulation of deleterious variants. While previous studies largely focused on coding variants, our study investigated both coding and non-coding contributions to genetic load in diverse chicken breeds, revealing the consequences of inbreeding and artificial selection on genome-wide patterns of deleterious variation. Using representative chicken populations with different selection histories, we show that domestication processes significantly impact the genetic load in chicken populations. Village chickens, which have experienced only the initial domestication, exhibit levels of neutral heterozygosity and realized load comparable to those of their wild progenitor, the Red Jungle Fowl. In contrast, breed chickens that have undergone more intense artificial selection show a significant decrease in neutral heterozygosity, an increase in the ratio of zerofold to fourfold heterozygosity, and a higher realized genetic load in both coding and non-coding regions. However, signals of purging of loss-of-function and non-coding deleterious variants were also detected in domestic chickens. Inbreeding is a major contributor to the increase of genome-wide realized load. We found selection against recently inbred individuals carrying long runs of homozygosity (ROHs) covering more coding regions, and an enrichment of homozygous non-coding deleterious variants in ROHs of at least 2 Mb. Additionally, we found that artificial selection drastically elevated the relative allele frequency of deleterious variants within sweep regions. These findings have implications for the importance of genetic background evaluation of breeding flocks and strategic management to maintain long-term health in domestic populations.
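The ratio of zerofold to fourfold heterozygosity mentioned above can be computed as in the following minimal sketch. The abstract does not describe the authors' pipeline, so the function names, the representation of genotypes as allele pairs (with `None` for missing calls), and the toy genotypes are hypothetical and serve only to show the arithmetic.

```python
def heterozygosity(genotypes):
    """Fraction of called sites that are heterozygous.

    Each genotype is an (allele, allele) tuple, or None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(a != b for a, b in called) / len(called)


def zerofold_fourfold_ratio(zerofold, fourfold):
    """Heterozygosity at zerofold degenerate sites divided by that at fourfold sites."""
    h4 = heterozygosity(fourfold)
    if h4 == 0:
        raise ValueError("fourfold heterozygosity is zero; ratio undefined")
    return heterozygosity(zerofold) / h4


# Hypothetical genotypes for one individual (not data from the study).
zerofold_sites = [(0, 0), (0, 1), (0, 0), (1, 1), None, (0, 0)]
fourfold_sites = [(0, 1), (0, 0), (0, 1), (1, 1), (0, 0)]

ratio = zerofold_fourfold_ratio(zerofold_sites, fourfold_sites)
print(ratio)
```

Fourfold degenerate sites serve here as the putatively neutral reference, so a higher ratio indicates proportionally more segregating variation at sites where every change alters the amino acid.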
Ruoshi Huang, Ying Zhen. Increased genetic load in breed chickens and reduced coding content in long ROH. Genome Biology and Evolution (journal article), published 2026-01-30. doi: 10.1093/gbe/evag015, https://doi.org/10.1093/gbe/evag015
The altered life history strategies of heterotrophic organisms often leave a profound genetic footprint on functions related to energy metabolism. In parasitic plants, the reliance on host-derived nutrients and the loss of photosynthesis in holoparasites have led to highly degraded or absent plastid genomes, but the impact of these changes on mitochondrial genome (mitogenome) evolution has remained controversial. By examining mitogenomes from 45 Orobanchaceae species including three independent transitions to holoparasitism and key evolutionary intermediates, we identified measurable and predictable genetic alterations in genomic shuffling, RNA editing, and intracellular (IGT) and horizontal gene transfer (HGT) en route to a nonphotosynthetic lifestyle. In-depth comparative analyses revealed DNA recombination and repair processes, especially conversion of RNA-mediated retroprocessing, as significant drivers for genome structure evolution. In particular, we identified a novel RNA-mediated IGT and HGT mechanism, which has not been demonstrated previously in cross-species and inter-organelle transfers. We propose a dosage effect mechanism to explain the biased transferability of plastid DNA to mitochondria across green plants, especially in heterotrophic lineages like parasites and mycoheterotrophs. Evolutionary rates scaled with these genomic changes, but the direction and strength of selection varied substantially among genes and clades, resulting in high contingency in mitochondrial genome evolution. Finally, we summarize mitochondrial evolutionary trends in Orobanchaceae that are potentially generalizable to other heterotrophic plants: increased recombination and repair activities, rather than relaxed selection alone, lead to differentiated genome structure compared to free-living species.
Liming Cai, Justin C Havird, Robert K Jansen. Recombination and retroprocessing in broomrapes reveal RNA-mediated gene transfer mechanism and a generalizable model for mitochondrial evolution in heterotrophic plants. Genome Biology and Evolution (journal article), published 2026-01-28. doi: 10.1093/gbe/evag025
Many molecular processes (e.g., replication, recombination, and transcription) use DNA as a template molecule, which may lead to an increase or decrease in the likelihood of spontaneous mutation and/or repair of mutations to this key information storage molecule. In the case of transcription, both positive and negative correlations with the likelihood of mutation have been observed across species, which have formed the basis of two proposed mechanistic models: transcription-associated mutagenesis and transcription-coupled repair. Here, we examine the patterns of spontaneous mutations in regions of low and high transcription in the aquatic microcrustacean, Daphnia. By mapping events from a long-term mutation accumulation study (n = 66 lineages derived from 9 different genotypes from three populations) against multiple, large-scale, publicly available RNA-seq datasets, we find that mutations are more frequently observed in regions of high transcription in D. magna, as well as in the sister taxon, D. pulex. The results are robust across mutation types (base substitutions, insertions, and deletions) and among transcriptional profiles (across developmental stages and environmental conditions). Overall, the positive correlation was robust to different methodological approaches and to controlling for other genomic features (such as GC content). Based on our observations, transcription-associated mutagenesis provides a more likely explanation for the positive relationship between mutation accumulation and transcription levels observed in Daphnia. Characterizing such patterns is important for understanding the evolution of genes, of differentially expressed regions of the genome, and of the mutation rate.
Jeremy E Coate, Eddie K H Ho, Sarah Schaack. Spontaneous mutations occur more in highly transcribed regions in Daphnia. Genome Biology and Evolution (journal article), published 2026-01-27. doi: 10.1093/gbe/evag021, https://doi.org/10.1093/gbe/evag021
Entamoeba is a genus of amoebozoan parasites commonly found in the intestines of animals. E. marina is the first exception, isolated from marine sediments, and may have adapted from animal intestines to the sea. However, the evolutionary history of E. marina remains uncertain due to the lack of a genome sequence. Here, we present the de novo genome and transcriptome of E. marina, obtained using Oxford Nanopore MinION and Illumina HiSeq/MiSeq. The genome of E. marina is approximately 37.5 Mbp in length and consists of 202 contigs, making it the second longest after that of E. invadens. E. marina showed a significant reduction in the major virulence-associated gene families, including cysteine proteases, lysosomal enzyme transporters, and surface galactose/N-acetylglucosamine-specific lectins, suggesting diversification, and more specifically a reduction, of pathogenicity-related genes. Genome and RNA-seq analyses also identified genes that are either conserved throughout eukaryotes or laterally transferred from prokaryotes and that are potentially responsible for salt tolerance. Our study provides insights into the mechanism underlying the lifestyle changes in the evolution of parasitic eukaryotes.
Tetsuro Kawano-Sugaya, Shinji Izumiyama, Tomoyoshi Nozaki. Draft genome of Entamoeba marina provides insights into the attenuation of pathogenicity and adaptation to the marine environment. Genome Biology and Evolution (journal article), published 2026-01-23. doi: 10.1093/gbe/evag020, https://doi.org/10.1093/gbe/evag020
Benjamin Dauphin, Tobias Baril, Emmanuelle Morin, Ursula Oggenfuss, Stephanie Pfister, Maira De Freitas Pereira, Igor V Grigoriev, Annegret Kohler, Francis Martin, Daniel Croll, Martina Peter
Transposable elements (TEs) play crucial roles in genome evolution and ecological adaptation in fungi, yet their dynamics in ectomycorrhizal species remain poorly understood. Cenococcum geophilum, the most widespread ectomycorrhizal fungus in boreal and temperate forests, has a large, repeat-rich genome and represents an ideal system to investigate TE-mediated adaptation to the physical environment and symbiotic lifestyle. However, previous studies have been limited by fragmented genome assemblies that prevented the resolution of repeat-rich regions. We assembled a telomere-to-telomere reference genome of C. geophilum strain 1.58 using PacBio HiFi and Hi-C datasets, resulting in a 178.54 Mbp genome with seven contiguous chromosomes. We identified 14,145 genes, and over 78% of the genome consists of TEs. Of these, 94% are affected by repeat-induced point mutations (RIP), a genome defence mechanism that acts during the sexual reproduction phase, indicating cryptic or ancient sexual reproduction in this putatively asexual fungus. LTR retrotransposons, LINEs, and DNA transposons dominated, with three TE families (Ty3, Ty1, and Tad1) contributing over 60% of the genome size, indicating recent transposition bursts. Screening of 15 additional C. geophilum strains revealed recent and lineage-specific TE expansions, implying that several TEs escaped the RIP machinery and retained potential activity. Supporting TE activity in the context of symbiosis, we found 56 TEs differentially transcribed between ectomycorrhizal and free-living mycelium tissues. An even higher number (n = 66) of TEs were differentially expressed between stress resistance morphology (i.e., sclerotia) and free-living mycelium. This supports the view that TEs are differentially regulated in response to symbiotic and stress-related conditions. Our results demonstrate that the C. geophilum genome expansion was driven by a few lineage-specific TE families in recent history, with high RIP activity attesting to sexual reproduction. We also provide insights into how TEs could respond to lifestyle transitions and traits associated with desiccation resistance.
Benjamin Dauphin, Tobias Baril, Emmanuelle Morin, Ursula Oggenfuss, Stephanie Pfister, Maira De Freitas Pereira, Igor V Grigoriev, Annegret Kohler, Francis Martin, Daniel Croll, Martina Peter. Chromosome-scale genome assembly of the most abundant ectomycorrhizal fungus Cenococcum geophilum reveals massive TE expansion and RIP defence mechanism. Genome Biology and Evolution (journal article), published 2026-01-22. doi: 10.1093/gbe/evag017, https://doi.org/10.1093/gbe/evag017
Previously proposed chronologies of amino acid incorporation into the genetic code rely on consensus rankings derived from prebiotic synthesis experiments, biosynthetic pathways, or genomic trends. However, the role of intrinsic molecular properties in shaping amino acid recruitment remains largely underexplored. In this study, we reconstruct a complexity-based amino acid chronology by integrating sixteen molecular complexity metrics from chemical graph and information theory. Unlike approaches influenced by environmental variability, detection biases, or the evolutionary constraints of genome-based chronologies, our method provides a perspective on amino acid incorporation independent of these factors. Instead of imposing a linear ranking, we derive a minimum spanning tree capturing complexity-based relationships between amino acids. The resulting hierarchy places structurally simple amino acids in basal positions, while biosynthetically complex residues appear later, aligning with existing prebiotic and genomic chronologies. Furthermore, amino acids positioned closer in the complexity space exhibit significantly greater mutational connectivity than expected by chance, suggesting that molecular complexity reflects underlying structural considerations that constrained the genetic code's evolutionary pathways. This supports the idea that the code evolved not only to maintain biochemical stability but also to facilitate complexity-preserving substitutions, ensuring smooth adaptive transitions while minimizing energetic cost differences. Additionally, molecular complexity significantly correlates with amino acid enrichment in LUCA's inferred proteome, reinforcing its role as a fundamental constraint on early protein evolution. Our approach, rooted in intrinsic molecular properties rather than external contingencies, offers new insights into the constraints shaping the genetic code and expands the scope for identifying universal principles of biochemical evolution.
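The idea of deriving a minimum spanning tree from complexity-based relationships, rather than a linear ranking, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not specify the sixteen metrics, the distance measure, or the algorithm, so the sketch uses two simple stand-in descriptors per amino acid (number of non-hydrogen atoms and number of rings), unscaled Euclidean distance, and Prim's algorithm. The function name `minimum_spanning_tree` is hypothetical.

```python
import heapq
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph whose edge weights are Euclidean distances.

    `points` maps a label to a feature vector; returns a list of (u, v, weight) edges.
    """
    names = list(points)
    start = names[0]
    in_tree = {start}
    heap = [(math.dist(points[start], points[v]), start, v) for v in names[1:]]
    heapq.heapify(heap)
    edges = []
    while heap and len(in_tree) < len(names):
        weight, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: v was already attached through a shorter edge
        in_tree.add(v)
        edges.append((u, v, weight))
        for x in names:
            if x not in in_tree:
                heapq.heappush(heap, (math.dist(points[v], points[x]), v, x))
    return edges


# Stand-in descriptors (assumption): (non-hydrogen atom count, ring count).
descriptors = {
    "Gly": (5, 0),
    "Ala": (6, 0),
    "Ser": (7, 0),
    "Val": (8, 0),
    "Leu": (9, 0),
    "Trp": (15, 2),
}

tree = minimum_spanning_tree(descriptors)
for u, v, w in tree:
    print(f"{u} - {v}: {w:.2f}")
```

With these stand-in descriptors the small aliphatic residues form a chain starting at glycine, and tryptophan attaches at the far end through leucine. A real analysis would need to normalize the metrics before computing distances, since unscaled Euclidean distance lets the descriptor with the largest range dominate.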
Syeda Ameena Hashmi, Hamed Chok, Ricardo Cabrera, Celia Blanco. Molecular Complexity Constrained Early Amino Acid Recruitment into the Genetic code. Genome Biology and Evolution (journal article), published 2026-01-20. doi: 10.1093/gbe/evag012, https://doi.org/10.1093/gbe/evag012
Understanding genomic function has historically relied on sequence conservation across evolutionary time. However, advances in genomics have revealed that functional innovations often arise from rapidly evolving, nonconserved elements that are frequently overlooked by conservation-based approaches. Among these, variable number tandem repeats (VNTRs) act as engines of both functional innovation and phenotypic consequence. VNTRs are repetitive genomic sequences whose copy numbers can vary significantly between individuals and species, influencing gene regulation, protein structure, and, ultimately, phenotypic diversity. Recent long-read assemblies and pangenomes now resolve VNTR loci accurately, enabling robust evolutionary reconstruction and functional associations. Here, we synthesize emerging insights into the functional and evolutionary impact of VNTRs in mammals. Specifically, we outline pressing questions on the mutational mechanisms driving VNTR evolution in humans and the selective forces maintaining their structural heterogeneity, and we propose a theoretical framework for their persistence through evolutionary tradeoffs.
Petar Pajic, Omer Gokcumen. Evolutionary Balancing of Genetic Consequence and Innovation in Mammals Through Variable Number Tandem Repeats. Genome Biology and Evolution, published 2026-01-02. DOI: 10.1093/gbe/evaf250. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12776774/pdf/
Jianqing Lin, Xinrui Long, Yan Gao, Wenhua Liu, M Thomas P Gilbert
The de-extinction of species using genome-editing approaches depends on acquiring high-quality genomic information from the extinct target. However, the degraded nature of the ancient DNA (aDNA) typical of most extinct species poses significant challenges to comprehensive genome reconstruction. A systematic evaluation, across different extinct species, of the minimum sequencing effort required to reliably map a genome to different reference genomes under varying DNA quality conditions is still lacking. Here, we systematically assess the impact of sequencing depth on genome coverage, heterozygosity estimation, and variant calling accuracy when mapping both true aDNA data generated from the extinct Christmas Island rat (Rattus macleari) and in silico simulated modern- and ancient-like data generated from a modern relative (the brown rat, Rattus norvegicus) to the black rat (Rattus rattus) reference genomes. Our results demonstrate that even sequencing depths of 100× fail to yield stable heterozygosity estimates and leave approximately 3.38% to 4.03% of the genome uncovered. These uncovered regions contained functionally relevant SNPs and indels, highlighting the limitations of reconstructing extinct genomes using reference sequences from extant relatives. Furthermore, simulations using computationally generated "degraded haploid and diploid" data based on the high-quality brown rat genome revealed that false-positive SNPs arise primarily from insufficient coverage and low data quality rather than from aDNA damage (e.g. miscoding lesions, fragment size) per se. These findings underscore the need to tailor sequencing depth standards to sample type, degradation level, and sequencing error profiles. This study provides a theoretical framework and methodological support for optimizing data strategies in aDNA research and, ultimately, for informing de-extinction efforts.
Mapping the Genomic Limits of De-Extinction in the Face of Ancient DNA Degradation. Genome Biology and Evolution, published 2026-01-02. DOI: 10.1093/gbe/evaf251. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12794020/pdf/
Correction to: Divergence and Selection in a Cryptic Species Complex (Geonoma undata: Arecaceae) in the Northern Andes of Colombia. Genome Biology and Evolution 18(1), published 2026-01-02. DOI: 10.1093/gbe/evag006. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12815257/pdf/