Pub Date: 2025-06-01; Epub Date: 2025-05-20; DOI: 10.1007/s00239-025-10252-w
Leopold Eckhart, Attila Placido Sachslehner, Julia Steinbinder, Heinz Fischer
Caspases are cysteine-dependent aspartate-directed proteases which have critical functions in programmed cell death and inflammation. Their catalytic activity depends on a catalytic dyad of cysteine and histidine within a characteristic protein fold, the so-called caspase domain. Here, we investigated the evolution of caspase-16 (CASP16), an enigmatic member of the caspase family, for which only a partial human gene had been reported previously. The presence of CASP16 orthologs in placental mammals, marsupials and monotremes suggests that caspase-16 originated prior to the divergence of the main phylogenetic clades of mammals. Caspase-16 proteins of various species contain a carboxy-terminal caspase domain and an amino-terminal prodomain predicted to fold into a caspase domain-like structure, which is a unique feature among caspases known so far. Comparative sequence analysis indicates that the prodomain of caspase-16 has evolved by the duplication of exons encoding the caspase domain, whereby the catalytic site was lost in the amino-terminal domain and conserved in the carboxy-terminal domain of caspase-16. The murine and human orthologs of CASP16 contain frameshift mutations and therefore represent pseudogenes (CASP16P). CASP16 of the chimpanzee displays more than 98% nucleotide sequence identity with the human CASP16P gene but, like CASP16 genes of other primates, has an intact protein coding sequence. We conclude that caspase-16 structurally differs from other mammalian caspases, and the pseudogenization of CASP16 distinguishes humans from their phylogenetically closest relatives.
Title: Caspase Domain Duplication During the Evolution of Caspase-16. Journal of Molecular Evolution, pp. 395-405. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198278/pdf/
Pub Date: 2025-06-01; Epub Date: 2025-05-30; DOI: 10.1007/s00239-025-10254-8
E J Huang, Jeeun Parksong, Amy F Peterson, Fernando Torres, Sergi Regot, Gabriel S Bever
The evolutionary origins of the three-tier mitogen-activated protein kinase (MAPK) signaling network remain poorly understood despite its indispensable role in eukaryote physiology. Here, we develop a novel two-step method combining relaxed ortholog candidate search with iterative phylogenetic evaluation to identify orthologs across critical eukaryotic lineages. We perform a comprehensive phylogenetic analysis to delineate the history of divergence for non-human orthologs of human paralogs along the human evolutionary backbone. Our detailed evolutionary trees of MAPKs, MAP2Ks, and MAP3Ks reveal two major pulses of coevolutionary tandem expansion: one predating the divergence of fungi and animals, and the other predating the origin of animals. Our reconstruction also infers a polyphyletic origin for the atypical MAPKs. Integrating functional literature across eukaryotic taxa with our trees reveals that the two clades of MAP3K, Sterile-like (STE) and tyrosine kinase-like (TKL), had distinct trajectories and influences on downstream pathway diversification. STEs that function as MAP3Ks are conserved across extant eukaryotes. While TKL MAP3Ks are absent in many early diverging eukaryotes, their expansion aligns phylogenetically and functionally with that of downstream MAP2Ks and MAPKs. We propose that the MAPK network originated as a STE MAP3K-regulated pathway, but subsequent recruitment and radiations of TKL MAP3Ks drove downstream diversification in parallel, manifesting in top-down finetuning of pathway specificity. Our study provides an evolutionary framework for the functional diversity of this complex signaling network, demonstrating that phylogenetic insights can generate new hypotheses to understand fundamental cellular processes.
Title: Refined Phylogenetic Ortholog Inference Reveals Coevolutionary Expansion of the MAPK Signaling Network Through Finetuning of Pathway Specificity. Journal of Molecular Evolution, pp. 423-440.
Pub Date: 2025-06-01; Epub Date: 2025-05-20; DOI: 10.1007/s00239-025-10251-x
Lucio Aliperti Car, Ignacio E Sánchez
Encoding of protein-coding sequences in a genome through evolution leads to characteristic proportions of codons and amino acids. Here, we present a simplified maximum entropy model that groups together codons with the same GC (guanine + cytosine) content and coding for the same amino acid and accounts for the stoichiometry of genetic elements in over 50,000 genomes with seven interpretable parameters. Our model includes both the cost of a codon given a genomic GC content and the metabolic cost of the corresponding amino acid. Both costs are essential for accurate prediction of codon and amino acid abundances. The best implementation of the model includes a universal equilibrium value for the genomic GC content below 50%, as suggested by the literature. It also splits the twenty amino acids into two groups forming strong (bases C and G) or weak (bases A and U) Watson-Crick base pairs with the anticodon, differing in the strength of GC-dependent selection. The entropy-cost trade-off suggests that each organism has sorted out the genome encoding problem given a value for its genomic GC content. The empirical boundaries to this trade-off suggest minimal values for the amino acid and codon entropies, which may limit the GC content of natural genomes.
Title: Genomic AT Bias Coupled with Amino Acid Metabolism Modulates Codon Usage. Journal of Molecular Evolution, pp. 379-394.
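The grouping described in the abstract above (codons pooled by encoded amino acid and by GC content) and the entropy it refers to can be sketched in a few lines of Python. This is only an illustration, not the authors' seven-parameter model: it assumes the standard genetic code, measures codon GC content as the number of G or C bases in the codon, and uses plain Shannon entropy in bits. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (an assumption of this sketch;
# other translation tables would need a different AMINO string).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_group(codon: str) -> tuple[str, int]:
    """Return (amino acid, number of G/C bases) for one DNA codon."""
    codon = codon.upper()
    return CODON_TABLE[codon], sum(base in "GC" for base in codon)


def group_counts(cds: str) -> Counter:
    """Count (amino acid, GC count) groups over an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    # Codons with ambiguous bases (e.g. N) are skipped.
    return Counter(codon_group(c) for c in codons if c in CODON_TABLE)


def shannon_entropy(counts: Counter) -> float:
    """Shannon entropy (bits) of the distribution implied by the counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations")
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values() if n > 0)


if __name__ == "__main__":
    toy_cds = "ATGGCCGCG"  # hypothetical toy sequence: Met, Ala, Ala
    groups = group_counts(toy_cds)
    print(groups)
    print(shannon_entropy(groups))
```

Pooling synonymous codons with equal GC counts shrinks the 64-codon alphabet to a smaller set of groups, which is what keeps a model of this kind down to a handful of parameters.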
Five members of the secreted Frizzled-related protein (SFRP) family act as antagonists of the Wnt signaling pathway in humans. These glycoproteins have two functional domains, the cysteine-rich domain (CRD) and the netrin-related domain (NTR), with a completely conserved disulfide bond in the CRD. Phylogenetic analysis revealed that this protein family can be divided into two subgroups, SFRP1/SFRP2/SFRP5 versus SFRP3/SFRP4. The SFRP3/SFRP4 group was found to be more closely related to the SFRP of the sponge Lubomirskia baicalensis, which is believed to represent the ancient origin of SFRPs. Model evaluation demonstrated that the homology models predicted for the human SFRPs are of high quality when compared with the Sizzled crystal structure of Xenopus laevis. Molecular dynamics simulation showed that SFRP1 and SFRP2 have the most stable structures over 100 ns of simulation. Multiple sequence alignment and conservation analysis of the human SFRPs showed that the CRD is more conserved than the NTR. The docking results indicated that SFRP3 has the highest binding affinity for Wnt3, while SFRP1 and SFRP5 have the lowest. Despite the lower affinity of SFRP1/SFRP5 for Wnt3, a higher positive charge in their NTR domains increases their local concentration near the secreting cells and enhances their antagonistic activity. In contrast, SFRP3/SFRP4 can act as antagonists at distant cells because their NTR domains have fewer positively charged regions and bind more weakly to heparin in the intercellular matrix.
Title: Evolutionary and Structural Assessment of the Human Secreted Frizzled-Related Protein (SFRP) Family. Authors: Ladan Mafakher, Elham Rismani, Ladan Teimoori-Toolabi. Journal of Molecular Evolution, pp. 350-369. Pub Date: 2025-06-01; DOI: 10.1007/s00239-025-10249-5
Pub Date: 2025-06-01; Epub Date: 2025-04-27; DOI: 10.1007/s00239-025-10247-7
Akira Hanashima, Yuu Usui, Ken Hashimoto, Satoshi Mohri
The emergence of connectin, also called titin, a muscular spring and the largest protein in living organisms, is critical in metazoan evolution as it enables striated muscle-based locomotion. However, its evolutionary history remains unclear. Here, we investigated the evolutionary process using genomic analysis and deduced the ancestor of connectin. The region between the HOX and WNT clusters in the human genome, where the connectin gene (CON (TTN)) is located, was quadrupled by two rounds of whole-genome duplication (WGD) in the ancestor of jawed vertebrates. However, connectin ohnologs were deleted before the advent of jawed vertebrates, resulting in a single connectin gene. Additionally, one of the connectin ohnologs created by the third round of teleost WGD disappeared, while the other was duplicated on the same chromosome. We also discovered that the connectin and connectin family genes consistently underwent local duplication on the same chromosome, though the underlying mechanism remains unknown. Using synteny analysis, we identified KALRN and its ohnolog TRIO as putative ancestral paralogs of the connectin gene. TRIO/KALRN has a connected structure of SESTD1-CCDC141-CON (TTN), and its synteny is conserved in the Drosophila genome. Furthermore, we confirmed that this connected structure, termed 'connectitin' (connected-connectin/titin), is conserved in cnidarians and placozoans. Molecular phylogenetic analysis revealed that it diverged from TRIO/KALRN prior to the emergence of these animals, suggesting that metazoan muscle may have a single origin. These findings enhance our understanding of the evolutionary processes of striated muscles in the animal kingdom.
Title: The Ancestor and Evolution of the Giant Muscle Protein Connectin/Titin. Journal of Molecular Evolution, pp. 306-321. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198301/pdf/
Pub Date: 2025-06-01; Epub Date: 2025-05-03; DOI: 10.1007/s00239-025-10248-6
Chun Wu, Nicholas J Paradis, Khushi Jain
The Ka/Ks ratio test, which assesses nonsynonymous versus synonymous substitution rates in the translated region (TR) of a genome, is widely used to quantify fitness changes due to mutations, but its critical limitations remain to be addressed. Ka/Ks can categorize the total fitness change as neutral (Ka/Ks = 1), beneficial (Ka/Ks > 1), or deleterious (Ka/Ks < 1) only if synonymous mutations are neutral. Otherwise, Ka/Ks only provides the fitness change due to protein sequence change. This neutrality assumption also renders the test inapplicable to sites in the non-protein-coding untranslated region (UTR). Our previous work introduced a substitution-mutation rate ratio (c/µ) test per nucleotide site (c: substitution rate in UTR/TR, or a mean value of Ka and Ks in TR; µ: mutation rate) as a generalized alternative to detect selection pressure, offering a broader application without the aforementioned assumptions. This paper derives a general equation linking c/µ with weighted Ks/µ and Ka/µ (c/µ = Ps*(Ks/µ) + Pa*(Ka/µ), where Ps and Pa are the proportions of synonymous and nonsynonymous sites under a mutation model and a codon table), demonstrating that Ka/Ks infers the same fitness change as c/µ only if synonymous mutations are neutral (i.e. Ks/µ = 1). Otherwise, Ka/Ks might provide a different assignment from the c/µ test. Indeed, our comparative analysis of the c/µ and Ka/Ks tests across 25 proteins of SARS-CoV-2 using three independent genomic sequence datasets shows that Ka/Ks inaccurately reports the type of fitness change for 7 proteins. Our findings advocate for the c/µ test to complement the traditional Ka/Ks test in detecting the selection pressure at a nucleotide site in a genome.
Title: Substitution-Mutation Rate Ratio (c/µ) As Molecular Adaptation Test Beyond Ka/Ks: A SARS-COV-2 Case Study. Journal of Molecular Evolution, pp. 322-349. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198311/pdf/
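The weighted relation quoted in the abstract above, c/µ = Ps*(Ks/µ) + Pa*(Ka/µ), can be illustrated with a short Python sketch. The function names, the tolerance, and every numeric input below are hypothetical and chosen only for illustration. Applying to c/µ the same threshold of 1 that the abstract states for Ka/Ks is an assumption of this sketch, not something the abstract spells out.

```python
def c_over_mu(ps: float, pa: float, ks: float, ka: float, mu: float) -> float:
    """Weighted substitution-to-mutation ratio: Ps*(Ks/mu) + Pa*(Ka/mu)."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return ps * (ks / mu) + pa * (ka / mu)


def classify(ratio: float, tol: float = 1e-9) -> str:
    """Label a ratio relative to 1 (assumed threshold, see the note above)."""
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "beneficial" if ratio > 1.0 else "deleterious"


if __name__ == "__main__":
    # Made-up inputs: 25% synonymous sites, 75% nonsynonymous sites,
    # and synonymous substitutions well below the mutation rate.
    ps, pa, mu = 0.25, 0.75, 1.0
    ks, ka = 0.4, 0.6

    print("Ka/Ks:", ka / ks, classify(ka / ks))          # ratio above 1
    ratio = c_over_mu(ps, pa, ks, ka, mu)
    print("c/mu: ", ratio, classify(ratio))              # ratio below 1
```

With these made-up values Ka/Ks is about 1.5 while c/µ is about 0.55, so the two tests assign different labels. The disagreement arises because Ks/µ is below 1, which is the situation the abstract describes when synonymous mutations are not neutral.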
Pub Date: 2025-04-01; Epub Date: 2025-04-09; DOI: 10.1007/s00239-025-10242-y
Yuan-Yuan Xie, Bin Wen, Ming-Zhu Bai, Yan-Yan Guo
Spliceosomal introns are a key characteristic of eukaryotic genes. However, the origins and mechanisms of new spliceosomal introns remain elusive, and definitive case studies documenting intron creation are still limited. This study examined the RECG1 genes of 49 land plants, including 21 orchids and 28 non-orchid species. Sequence comparison revealed that the fourth intron of Gastrodia and Platanthera (Orchidaceae) is a newly gained spliceosomal intron, originating from the intronization of former exonic sequences. This intronization event was accompanied by the creation of novel recognizable GT/AG splice sites. In contrast, other orchid species lack the corresponding splice sites in the counterpart regions. Moreover, the secondary and tertiary protein structures implied that the intronization events do not affect the protein function. Given the diverse trophic modes of the two genera, we infer that relaxed selection may have contributed to the fluidity of gene structures. This study provides a typical example of de novo lineage-specific intron creation via intronization in orchids supported by multiple lines of evidence, and the two intronization events occurred independently in the same gene. This research enhances our understanding of gene evolution in orchids and provides valuable insights that may assist the annotation of structurally complex genes.
Title: De Novo Creation of Two Novel Spliceosomal Introns of RECG1 by Intronization of Formerly Exonic Sequences in Orchidaceae. Journal of Molecular Evolution, pp. 267-277.
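The abstract above notes that intronization was accompanied by new recognizable GT/AG splice sites. As a minimal sketch of what such a check looks like, the Python below tests whether a candidate interval starts with GT and ends with AG, and removes it from the sequence. The function names, the 0-based half-open coordinates, and the toy sequence are all hypothetical. Real intron annotation relies on transcript evidence and further signals (for example the branch point) that this sketch ignores.

```python
def normalize(seq: str) -> str:
    """Upper-case a sequence and express it in the DNA alphabet."""
    return seq.upper().replace("U", "T")


def has_canonical_splice_sites(genomic: str, start: int, end: int) -> bool:
    """True if genomic[start:end] begins with GT and ends with AG.

    Coordinates are 0-based and half-open, like Python slices.
    """
    intron = normalize(genomic[start:end])
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice_out(genomic: str, start: int, end: int) -> str:
    """Return the sequence with the interval [start, end) removed."""
    return genomic[:start] + genomic[end:]


if __name__ == "__main__":
    # Toy sequence: exon "ATGGCC", candidate intron "GTAAGTTTCAG", exon "GATTAA".
    toy = "ATGGCCGTAAGTTTCAGGATTAA"
    print(has_canonical_splice_sites(toy, 6, 17))
    print(splice_out(toy, 6, 17))
```

Running the same check on the aligned counterpart region of a species without the intron would be expected to fail, which mirrors the abstract's observation that other orchids lack the corresponding splice sites.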
Pub Date: 2025-04-01; Epub Date: 2025-02-18; DOI: 10.1007/s00239-025-10238-8
Qiuhua Xie, Yuange Duan
A-to-I mRNA editing resembles A-to-G mutations. Functional mRNA editing, which represents only a small fraction of all editing events, can be inferred from the experimental removal of editing. However, it is natural to ask why evolution chose RNA editing rather than directly (and simply) changing the genomic sequence to G. If G is better than A, then drift or constructive neutral evolution (CNE) theory can explain the emergence of such editing, but it remains unclear why the exemplified conserved editing is perfectly maintained without any subsequent A-to-G DNA mutations being observed. Virtually every functional and conserved mRNA editing site faces this ultimate question until one justifies that being editable is better than a hardwired genomic allele. While the advantage of editability has been validated in fungi, this ultimate question has not been answered for any functional editing sites in animals. By providing several conceptual arguments and specific examples, we propose that proving the evolutionary adaptiveness of an editing site is far more difficult than revealing its function.
Title: An Ultimate Question for Functional A-to-I mRNA Editing: Why Not a Genomic G? Journal of Molecular Evolution, pp. 185-192.
Pub Date: 2025-04-01 Epub Date: 2025-04-03 DOI: 10.1007/s00239-025-10245-9
Luke R Arnce, Jaclyn E Bubnell, Charles F Aquadro
The protein encoded by the Drosophila melanogaster gene bag of marbles (bam) plays an essential role in early gametogenesis by complexing with the gene product of benign gonial cell neoplasm (bgcn) to promote germline stem cell (GSC) daughter differentiation in males and females. Here, we compared the structures of the Bam protein and the Bam:Bgcn protein complex predicted by AlphaFold2 and AlphaFold Multimer in D. melanogaster, D. simulans, and D. yakuba, where bam is necessary for gametogenesis, with those in D. teissieri, where it is not. Despite significant sequence divergence, we find very little evidence of significant structural differences in high-confidence regions of the structures across the four species. This suggests that Bam structure is unlikely to be a direct cause of its functional differences between species and that Bam may simply not be integrated in an essential manner for GSC differentiation in D. teissieri. Patterns of positive selection and significant amino acid diversification across species are consistent with the Selection, Pleiotropy, and Compensation (SPC) model, in which the selection detected at bam reflects adaptive change in one major trait followed by positively selected compensatory changes for pleiotropic effects (in this case perhaps preserving structure). In the case of bam, we suggest that the major trait could be genetic interaction with the endosymbiotic bacterium Wolbachia pipientis. Following up on detected signals of positive selection with comparative structural analysis could provide insight into the distribution of a primary adaptive change versus the compensatory changes that follow it.
{"title":"Comparative Analysis of Drosophila Bam and Bgcn Sequences and Predicted Protein Structural Evolution.","authors":"Luke R Arnce, Jaclyn E Bubnell, Charles F Aquadro","doi":"10.1007/s00239-025-10245-9","DOIUrl":"10.1007/s00239-025-10245-9","PeriodicalId":16366,"journal":{"name":"Journal of Molecular Evolution","pages":"278-291"},"PeriodicalIF":2.1,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12006264/pdf/","platform":"Semanticscholar","paperid":"143772511","RegionNum":3,"RegionCategory":"Biology","PMCID":"OA"}
Pub Date: 2025-04-01 Epub Date: 2025-02-17 DOI: 10.1007/s00239-025-10236-w
C L Molina, M M Magalhães, A C Rodrigues, S A Taniwaki, S O de Souza Silva, G A König, P E Brandão
Due to the COVID-19 pandemic and the uncertainty about aspects of its origin, interest in investigating coronaviruses in wild animals has increased in recent years. Bats host the greatest diversity of coronaviruses known to date, including the ancestors of viruses that have caused outbreaks in humans. Although information on coronaviruses in bats in Brazil has expanded, the available data remain unrepresentative. To help shed some light on this matter, we collected 175 samples from bats of different species from two Brazilian states. Here, we report the previously unknown presence of an alphacoronavirus in a bat (Molossus sp.) from Ceará. The phylogenetic analysis showed close relationships with alphacoronaviruses from Brazil and Argentina, but it was not possible to determine the subgenus or species of this virus using the RNA-dependent RNA polymerase (RdRp) domain of the nsp12 protein-coding sequence, as it was distant from the specimens considered by the International Committee on Taxonomy of Viruses (ICTV). Finally, by performing high-throughput sequencing, we were able to find contigs mostly belonging to domains of the replicase of bat coronaviruses related to American bats of the Molossidae and Vespertilionidae families.
{"title":"Detection of an Alphacoronavirus in a Brazilian Bat (Molossus sp.).","authors":"C L Molina, M M Magalhães, A C Rodrigues, S A Taniwaki, S O de Souza Silva, G A König, P E Brandão","doi":"10.1007/s00239-025-10236-w","DOIUrl":"10.1007/s00239-025-10236-w","PeriodicalId":16366,"journal":{"name":"Journal of Molecular Evolution","pages":"257-266"},"PeriodicalIF":2.1,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143441168","RegionNum":3,"RegionCategory":"Biology"}