Phylogenetic branch lengths are essential for many analyses, such as estimating divergence times, analyzing rate changes, and studying adaptation. However, true gene tree heterogeneity due to incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer can complicate the estimation of species tree branch lengths. While several tools exist for estimating the topology of a species tree addressing various causes of gene tree discordance, much less attention has been paid to branch length estimation on multi-locus datasets. For single-copy gene trees, some methods are available that summarize gene tree branch lengths onto a species tree, including coalescent-based methods that account for heterogeneity due to incomplete lineage sorting. However, no such branch length estimation method exists for multi-copy gene family trees that have evolved with gene duplication and loss. To address this gap, we introduce the CASTLES-Pro algorithm for estimating species tree branch lengths while accounting for both gene duplication and loss and incomplete lineage sorting. CASTLES-Pro improves on the existing coalescent-based branch length estimation method CASTLES by increasing its accuracy for single-copy gene trees and extending it to handle multi-copy ones. Our simulation studies show that CASTLES-Pro is generally more accurate than alternatives, eliminating the systematic bias toward overestimating terminal branch lengths often observed when using concatenation. Moreover, although CASTLES-Pro is not theoretically designed for horizontal gene transfer, we show that it is relatively robust to random horizontal gene transfer, though its accuracy can degrade at the highest levels of horizontal gene transfer.
Species Tree Branch Length Estimation despite Incomplete Lineage Sorting, Duplication, and Loss. Yasamin Tabatabaee, Chao Zhang, Shayesteh Arasti, Siavash Mirarab. Genome Biology and Evolution 17(11), published 2025-10-29. doi:10.1093/gbe/evaf200. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12648238/pdf/
In species where mitochondrial DNA (mtDNA) is maternally inherited, such as vertebrates, mtDNA mutations that harm only males are not subject to purifying selection and thus can spread in a population, especially when these mutations benefit females. Therefore, the mother's curse hypothesis (MCH) posits a greater mtDNA mutation load in males than in females. MCH is potentially important for human health, disease, and evolution, but a systematic test that considers the vast human mtDNA variation is lacking. Analyzing the genotypic and phenotypic data of approximately 0.5 million British participants in the UK Biobank, we estimate the reproductive fitness of mtDNA variants in each sex. Contradicting MCH, a positive intersexual correlation in the number of offspring exists across mitochondrial haplogroups. While significant variation in the number of opposite-sex sexual partners (a proxy for reproductive fitness in premodern societies) is present among mitochondrial haplogroups, no significant intersexual correlation in this quantity is detected. The frequencies of a few mtDNA variants differ significantly between males and females, suggesting that these variants differentially affect survival in the two sexes, but the number of such variants with lower male frequencies is not significantly different from that with lower female frequencies. Analysis of disease associations also finds no enrichment of male disease-associated mtDNA variants despite the discovery of multiple sex-biased disease associations. Together, these findings provide no genomic support for MCH in humans and suggest no difference in mtDNA mutation load between the two sexes that is detectable in the UK Biobank.
Testing the Mother's Curse Hypothesis in Human Mitochondrial Genome Evolution. Ruiqi Yuan, Jianzhi Zhang. Genome Biology and Evolution, published 2025-10-29. doi:10.1093/gbe/evaf207. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629233/pdf/
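The intersexual correlation across mitochondrial haplogroups mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the record layout, the function names `pearson_r` and `intersexual_correlation`, the choice of a Pearson coefficient, and the toy values are all assumptions made for illustration. The idea is to average the number of offspring per haplogroup separately in females and males, then correlate the two sets of means. A positive value means haplogroups associated with more offspring in one sex also tend to be associated with more offspring in the other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def intersexual_correlation(records):
    """Correlate per-haplogroup mean offspring number between the sexes.

    records: iterable of (haplogroup, sex, n_offspring), with sex 'F' or 'M'.
    This record layout is hypothetical, chosen only for the sketch.
    """
    totals = {}  # haplogroup -> {'F': [sum, count], 'M': [sum, count]}
    for haplogroup, sex, n_offspring in records:
        per_sex = totals.setdefault(haplogroup, {"F": [0, 0], "M": [0, 0]})
        per_sex[sex][0] += n_offspring
        per_sex[sex][1] += 1

    female_means, male_means = [], []
    for haplogroup in sorted(totals):
        f_sum, f_count = totals[haplogroup]["F"]
        m_sum, m_count = totals[haplogroup]["M"]
        if f_count and m_count:  # keep haplogroups observed in both sexes
            female_means.append(f_sum / f_count)
            male_means.append(m_sum / m_count)
    return pearson_r(female_means, male_means)


# Invented toy values; they are not UK Biobank data.
toy_records = [
    ("H", "F", 2), ("H", "F", 3), ("H", "M", 2), ("H", "M", 2),
    ("U", "F", 1), ("U", "F", 2), ("U", "M", 1), ("U", "M", 1),
    ("J", "F", 3), ("J", "F", 3), ("J", "M", 3), ("J", "M", 2),
]
r = intersexual_correlation(toy_records)
print(f"intersexual correlation r = {r:.3f}")
```

A real analysis would also need a significance test and adjustment for covariates such as age, which this sketch leaves out.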
Giovanni Marturano, Diego Carli, Claudio Cucini, Elena Cardaioli, Antonio Carapelli, Federico Plazzi, Francesco Frati, Marco Passamonti, Francesco Nardi
SmithRNAs are a novel class of small noncoding RNAs that are encoded in the mitochondrial genome and regulate the expression of nuclear transcripts. They have been recently described in the Manila clam Ruditapes philippinarum and their biological function has been confirmed in vivo. It is currently unclear whether smithRNAs are a unique feature of this species, possibly related to the peculiar mechanism of sex determination observed in bivalves, or a common feature of Metazoa. Aiming at a broader survey on the presence and biological features of smithRNAs across Metazoa, 14 species were selected to represent major lineages, and for each species small RNAseq data, as well as the transcriptome, mitochondrial and nuclear genomes, were collected from the literature or sequenced/assembled de novo. Data were analyzed using the SmithHunter pipeline, a recently published tool specifically designed to identify smithRNAs and their targets. Candidate smithRNAs were identified in all species studied, supporting the notion that smithRNAs are a common feature across Metazoa. SmithRNAs are generally encoded within other genes, on the same strand, and with a preference for mitochondrial rRNAs and tRNAs. Based on their strandedness and preferential position at the 5'-end of the encompassing gene, a transcription mechanism is proposed where smithRNAs are cleaved off from gene-specific transcripts after the maturation of the two primary mitochondrial transcripts. A substantial variability was identified concerning the possible nuclear targets of smithRNAs, with a preference for regulation/response terms, mitochondrial functions, and sex/germline associated terms.
SmithRNAs: A Common Feature among Metazoa. Genome Biology and Evolution, published 2025-10-29. doi:10.1093/gbe/evaf208. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12628786/pdf/
T Brock Wooldridge, Merly Escalona, Blair W Perry, Alexis N Enstrom, Dalya Salih, William E Seligmann, Samuel Sacco, Katherine L Moon, Ruta Sahasrabudhe, Noravit Chumchim, Oanh Nguyen, Joanna L Kelley, Ross D E MacPhee, Beth Shapiro
Reconstructions of evolutionary history can be restricted by a lack of high-quality reference genomes. To date, only four of the eight species of bears (family Ursidae) have chromosome-level genome assemblies. Here, we present assemblies for three additional species (the sun, sloth, and Andean bears) and use a whole-genome alignment of all bear species and other carnivores to reconstruct the evolution of Ursidae. Multiple divergence dating approaches suggest that the six ursine bears likely diversified in the last 5 Ma, but that divergence times within Ursinae are significantly impacted by gene tree heterogeneity. Consistent with this, we observe that nearly 50% of gene trees conflict with our highly supported species tree, a pattern driven by a significant early hybridization event within Ursinae. We also find that the karyotype of Ursinae is largely similar to the ancestral karyotype of all bears twenty million years prior. In contrast to this conservation of structure, dozens of chromosomal fissions and fusions associated with LINE/L1 retrotransposons dramatically restructured the genomes of the giant panda and Andean bear. Finally, we leverage these genomes to identify species-specific evidence for positive selection on genes associated with color, diet, and metabolism. One of these genes, TCPN2, has a role in pigmentation and shows a series of amino acid mutations in the polar bear over the last 0.5 Ma. Collectively, these new genomic resources enable improved reconstruction of the complex evolutionary history of bears and clarify how this enigmatic group diversified.
Chromosome-scale Genomes Show Rapid Diversification and Ancient Gene Flow Among Bear Species. Genome Biology and Evolution, published 2025-10-29. doi:10.1093/gbe/evaf188. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12631120/pdf/
Kristen A Behrens, Zexuan Zhao, Michael R Kidd, Alfred Maluwa, Avner Cnaani, Stephan Koblmüller, Thomas D Kocher
Cichlid fishes have undergone an extraordinary diversification in East Africa. They also have a high rate of sex chromosome turnover. This clade provides an opportunity to study the rates and patterns of sex chromosome turnover, and the interactions of sex chromosome turnover with adaptation and speciation. Here we investigate the evolution of sex chromosomes in the tribes Tilapiini, Coptodonini, Heterotilapiini, Gobiocichlini, Pelmatolapiini and Oreochromini. We assembled chromosome-scale genomes of male and female Pelmatotilapia mariae. We then mapped pooled sequencing reads for males and females of P. mariae and 12 additional species to several genome assemblies to identify sex chromosomes. Tilapia sparrmanii and Oreochromis aureus share a ZW system on LG3 that overlaps the ZW system identified in P. mariae. Heterotilapia buettikoferi, T. brevimanus and Coptodon bakossiorum share an XY system mapping to another region of LG3. Coptodon zilli, Sarotherodon galilaeus, S. melanotheron and O. niloticus share an XY system on LG1. Finally, O. mossambicus and O. shiranus share an XY system on LG14, and we find evidence of an XY system on LG20 in Danakilia sp. 'shukoray'. The phylogenetic distribution of these sex determination systems suggests a long period of polymorphism for the systems on LG1 and LG3 and a generally lower rate of sex chromosome turnover in these lineages compared to the lacustrine lineages of the East African radiation. Our data are not consistent with the recent suggestion of figla and banf2 as candidate genes for the LG1XY and LG3ZW systems. We suggest a possible role for ubiquitination in the XY systems on LG3.
Before the East African radiation: sex chromosome systems in basal haplotilapiine cichlids. Genome Biology and Evolution, published 2025-10-09. https://doi.org/10.1093/gbe/evaf191
Joris Mordier, Marine Fraisse, Michel Cohen-Tannoudji, Antoine Molaro
SCHLAFEN proteins are a large family of RNase-related enzymes carrying essential immune and developmental functions. Despite these important roles, Schlafen genes display varying degrees of evolutionary conservation in mammals. While this appears to influence their molecular activities, a detailed understanding of these evolutionary innovations is still lacking. Here, we used in-depth phylogenomic approaches to characterize the evolutionary trajectories and selective forces shaping mammalian Schlafen genes. We traced lineage-specific Schlafen amplifications and found that recent duplicates evolved under distinct selective forces, supporting repeated subfunctionalization cycles. Codon-level natural selection analyses in primates and rodents identified recurrent positive selection over Schlafen protein domains engaged in viral interactions. Combining known crystal structures and predicted protein structures, we discovered a novel class of rapidly evolving residues enriched at the contact interface of SCHLAFEN protein dimers. Our results suggest that inter-SCHLAFEN compatibilities are under strong selective pressures and are likely to impact their molecular functions. We posit that cycles of genetic conflicts with pathogens and between paralogs drove Schlafens' recurrent evolutionary innovations in mammals.
Recurrent Evolutionary Innovations in Rodent and Primate Schlafen Genes. Genome Biology and Evolution, published 2025-09-30. doi:10.1093/gbe/evaf172. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12492265/pdf/
Guy Leonard, Benjamin H Jenkins, Fiona R Savory, Estelle S Kilias, Finlay Maguire, David S Milner, Thomas A Richards
How two species engage in stable endosymbiosis is a biological quandary. The study of facultative endosymbiotic interactions has emerged as a useful approach to understand how endosymbiotic functions can arise. The ciliate protist Paramecium bursaria hosts green algae of the order Chlorellales in a facultative photo-endosymbiosis. We have recently reported RNAi as a tool for understanding gene function in P. bursaria 186b (CCAP strain 1660/18). To complement this work, here we report a near-complete host genome and transcriptome sequence dataset, using both Illumina and PacBio sequencing methods, in order to aid genome analysis and to enable the design of RNAi experiments. Our analyses demonstrate that P. bursaria 186b, like other ciliates such as diverse species of Paramecia, possesses numerous tiny introns. These data patterns, combined with the alternative genetic code common to ciliates, make gene identification and annotation challenging; as such, we identify gene models using Iso-Seq methodologies. These data will aid the investigation of genome evolution in the Paramecia and provide additional source data for the exploration of endosymbiotic functions.
De Novo Genome Sequence Assembly of the RNAi-Tractable Paramecium bursaria 186b: An Endosymbiotic Model System. Genome Biology and Evolution, published 2025-09-30. doi:10.1093/gbe/evaf183. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12569599/pdf/
Kevin Rychel, Ke Chen, Edward A Catoiu, Elina Olson, Troy E Sandberg, Ye Gao, Sibei Xu, Ying Hefner, Richard Szubin, Arjun Patel, Adam M Feist, Bernhard O Palsson
Adaptive laboratory evolution can generate microbial strains that exhibit extreme phenotypes, revealing fundamental biological adaptation mechanisms. Here, we use adaptive laboratory evolution to evolve Escherichia coli strains that grow at temperatures as high as 45.3 °C, a temperature lethal to wild-type cells. The strains adopted a hypermutator phenotype and employed multiple systems-level adaptations that made global analysis of the DNA mutations difficult. Given the challenge at the genomic level, we were motivated to uncover high-temperature tolerance adaptation mechanisms at the transcriptomic level. We employed independently modulated gene set (iModulon) analysis to reveal five transcriptional mechanisms underlying growth at high temperatures. These mechanisms were connected to acquired mutations, changes in transcriptome composition, sensory inputs, phenotypes, and protein structures. They are as follows: (i) downregulation of general stress responses while upregulating the specific heat stress responses, (ii) upregulation of flagellar basal bodies without upregulating motility and upregulation of fimbriae, (iii) shift toward anaerobic metabolism, (iv) shift in regulation of iron uptake away from siderophore production, and (v) upregulation of yjfIJKL, a novel heat tolerance operon whose structures we predicted with AlphaFold. iModulons associated with these five mechanisms explain nearly half of all variance in gene expression in the adapted strains. These thermotolerance strategies reveal that optimal coordination of known stress responses and metabolism can be achieved with a small number of regulatory mutations and may suggest a new role for large protein export systems. Adaptive laboratory evolution with transcriptomic characterization is a productive approach for elucidating and interpreting adaptation to otherwise lethal stresses.
Laboratory Evolution Reveals Transcriptional Mechanisms Underlying Thermal Adaptation of Escherichia coli. Genome Biology and Evolution 17(10), published 2025-09-30. doi:10.1093/gbe/evaf171. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12492005/pdf/
Jamie D Dixson, Abhijay Azad, Pamela A Padilla, Rajeev K Azad
Cytochrome P450s are a superfamily of heme-binding monooxygenases involved in the detoxification of intrinsic and extrinsic toxins. They are nearly ubiquitous and are found in all domains of life. Members of families within the superfamily are defined based on amino acid identity thresholds, with thresholds as low as 40% in some families. Relationships among Cytochrome P450 families have proven elusive due to sub-Twilight Zone interfamily identities (<30%) that result in poor multiple sequence alignment quality and thus low levels of support for downstream phylogenetic reconstructions. Despite the low identities, Cytochrome P450 structures are remarkably well conserved both within and among families. In such cases, structural phylogenetics has the potential to unveil elusive relationships because the selectively favored physicochemical properties giving rise to the structure and function of the proteins persist despite sequence-level divergence. Recently, in two separate publications, we demonstrated that by utilizing physicochemical vectors, dynamic time warping, and hierarchical clustering (PCDTW), large swaths of protein domain families and betacoronavirus receptor-binding domain clades were congruent with validated functional/structural relationships. These findings were important because they resolved anomalous sequence alignment-based maximum likelihood phylogenies that were not congruent with the known functional relationships. They also validated the use of physicochemical vectors in making inferences about structural/functional homology. Additionally, they suggested that the same methods might be applied to other protein families with relationships that are difficult to resolve from sequence data alone.
Herein, we used Molecular Weight and Hydrophobicity Physicochemical Dynamic Time Warping (MWHP PCDTW) along with structural and sequence alignment-based phylogenetic methodologies to analyze all of the Cytochrome P450s found both in the high-fidelity Structural Classification of Proteins (SCOP) database and the reviewed sequences with both experimentally resolved and de novo predicted structures in the Protein Data Bank and the AlphaFold (AF) Protein Structure Database, respectively. We compared the resulting phylogenetic topologies and found that in some cases, structure-based methods may be less able to resolve random/convergent similarity than physicochemical and sequence-based methodologies. This result agrees with previous findings that demonstrate the usefulness of physicochemical properties in resolving both random structural similarity and potentially convergent relationships.
{"title":"Inference of Cytochrome P450 Evolutionary History Using Structural and Physicochemical Metrics.","authors":"Jamie D Dixson, Abhijay Azad, Pamela A Padilla, Rajeev K Azad","doi":"10.1093/gbe/evaf178","DOIUrl":"10.1093/gbe/evaf178","url":null,"PeriodicalId":12779,"journal":{"name":"Genome Biology and Evolution","volume":" ","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12502919/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145091130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
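The PCDTW idea summarized in the Cytochrome P450 abstract above (physicochemical vectors compared by dynamic time warping) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the Kyte-Doolittle hydropathy scale as a stand-in for the hydrophobicity component, omits the molecular weight component, and uses made-up toy peptides. The function names `to_vector`, `dtw_distance`, and `pairwise_dtw` are hypothetical.

```python
# Illustrative sketch only: hydropathy vectors compared with dynamic time warping.
# Assumption: Kyte-Doolittle hydropathy values stand in for the paper's scales.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def to_vector(sequence):
    """Map an amino acid sequence to a list of hydropathy values."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence]


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    The local cost is the absolute difference between two values; each step
    may advance in a, in b, or in both, so insertions and deletions are
    absorbed by warping rather than by explicit gaps.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def pairwise_dtw(sequences):
    """Return {(name_i, name_j): distance} for every unordered pair of names."""
    names = sorted(sequences)
    vectors = {name: to_vector(sequences[name]) for name in names}
    return {
        (p, q): dtw_distance(vectors[p], vectors[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


if __name__ == "__main__":
    # Hypothetical toy peptides, not real Cytochrome P450 sequences.
    toy = {"a": "MKTLLV", "b": "MKTLV", "c": "DDEEKK"}
    for pair, dist in pairwise_dtw(toy).items():
        print(pair, round(dist, 2))
```

In this toy input, sequences `a` and `b` differ only by a repeated leucine, which warping absorbs, so their distance is 0.0, while `c` is far from both. A distance table like this is the kind of input that a hierarchical clustering step would consume to produce a tree.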
Structural variation makes an important contribution to canine evolution and phenotypic differences. Although recent advances in long-read sequencing have enabled the generation of multiple canine genome assemblies, most prior analyses of structural variation have relied on short-read sequencing. To offer a more complete assessment of structural variation in canines, we performed an integrative analysis of structural variants present in 12 canine samples with available long-read and short-read sequencing data along with genome assemblies. Use of long reads permits the discovery of heterozygous variation that is absent in existing haploid assembly representations while offering a marked increase in the ability to identify insertion variants relative to short-read approaches. Examination of the size spectrum of structural variants shows that dimorphic LINE-1 and SINE variants account for over 45% of all deletions, and we identified 1,410 LINE-1s with intact open reading frames that show presence-absence dimorphism. Using a graph-based approach, we genotyped newly discovered structural variants in an existing collection of 1,879 resequenced dogs and wolves, generating a variant catalog with a 56.5% increase in the number of deletions and a 705% increase in the number of insertions relative to those previously found in the analyzed samples. Examination of allele frequencies across admixture components present across breed clades identified 283 structural variants evolving with a signature of selection.
{"title":"Integrative Genotyping and Analysis of Canine Structural Variation Using Long-read and Short-read Data.","authors":"Peter Z Schall, Jeffrey M Kidd","doi":"10.1093/gbe/evaf173","DOIUrl":"10.1093/gbe/evaf173","url":null,"PeriodicalId":12779,"journal":{"name":"Genome Biology and Evolution","volume":" ","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12481690/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145091397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}