Pub Date: 2024-10-01; Epub Date: 2024-07-17; DOI: 10.1007/s00239-024-10188-7
Freya Kailing, Jules Lieberman, Joshua Wang, Joshua L Turner, Aaron D Goldman
Current evidence suggests that some form of cellular organization arose well before the time of the last universal common ancestor (LUCA). Standard phylogenetic analyses have shown that several protein families associated with membrane translocation, membrane transport, and membrane bioenergetics were very likely present in the proteome of the LUCA. Despite these cellular systems emerging prior to the LUCA, extant archaea, bacteria, and eukaryotes have significant differences in cellular infrastructure and the molecular functions that support it, leading some researchers to argue that true cellularity did not evolve until after the LUCA. Here, we use recently reconstructed minimal proteomes of the LUCA as well as the last archaeal common ancestor (LACA) and the last bacterial common ancestor (LBCA) to characterize the evolution of cellular systems along the first branches of the tree of life. We find that a broad set of functions associated with cellular organization were already present by the time of the LUCA. The functional repertoires of the LACA and LBCA related to cellular organization nearly doubled along each branch following the divergence of the LUCA. These evolutionary trends created the foundation for similarities and differences in cellular organization between the taxonomic domains that are still observed today.
Title: Evolution of Cellular Organization Along the First Branches of the Tree of Life. Journal of Molecular Evolution, pages 618-623. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458647/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-07-18; DOI: 10.1007/s00239-024-10187-8
Luis Delaye
Darwin's theory of common descent ultimately implies that all life on Earth descends from a common ancestor. Biochemistry and molecular biology now provide sufficient evidence of the shared ancestry of all extant life forms. However, the nature of the Last Universal Common Ancestor (LUCA) has been a topic of much debate over the years. This review offers a historical perspective on different attempts to infer LUCA's nature, exploring the debate surrounding its complexity. We further examine how different methodologies identify sets of ancient proteins that exhibit only partial overlap. For example, different bioinformatic approaches have identified distinct protein subunits of the ATP synthetase as potentially inherited from LUCA. Additionally, we discuss how detailed molecular evolutionary analysis of reverse gyrase has modified previous inferences about a hyperthermophilic LUCA that were based mainly on automatic bioinformatic pipelines. We conclude by emphasizing the importance of developing a database dedicated to studying genes and proteins traceable back to LUCA and earlier stages of cellular evolution. Such a database would house the most ancient genes on Earth.
Title: The Unfinished Reconstructed Nature of the Last Universal Common Ancestor. Journal of Molecular Evolution, pages 584-592. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458799/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-09-08; DOI: 10.1007/s00239-024-10201-z
Gregory P Fournier
Abiogenesis is frequently envisioned as a linear, ladder-like progression of increasingly complex chemical systems, eventually leading to the ancestors of extant cellular life. This "pre-cladistics" view is in stark contrast to the well-accepted principles of organismal evolutionary biology, as informed by paleontology and phylogenetics. Applying this perspective to origins, I explore the paradigm of "Stem Life," which embeds abiogenesis within a broader continuity of diversification and extinction of both hereditary lineages and chemical systems. In this new paradigm, extant life's ancestral lineage emerged alongside and was dependent upon many other complex prebiotic chemical systems, as part of a diverse and fecund prebiosphere. Drawing from several natural history analogies, I show how this shift in perspective enriches our understanding of Origins and directly informs debates on defining Life, the emergence of the Last Universal Common Ancestor (LUCA), and the implications of prebiotic chemical experiments.
Title: Stem Life: A Framework for Understanding the Prebiotic-Biotic Transition. Journal of Molecular Evolution, pages 539-549. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458642/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-08-09; DOI: 10.1007/s00239-024-10193-w
Aaron D Goldman, Arturo Becerra
Title: A New View of the Last Universal Common Ancestor. Journal of Molecular Evolution, pages 659-661. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458664/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-08-15; DOI: 10.1007/s00239-024-10194-9
Wolfgang Cottom-Salas, Arturo Becerra, Antonio Lazcano
One of the central issues in the understanding of early cellular evolution is the characterisation of the cenancestor. This includes the description of the chemical nature of its genome. The disagreements on this question comprise several proposals, including the possibility that AlkB-mediated methylation repair of alkylated RNA molecules may be interpreted as evidence of a cenancestral RNA genome. We present here an evolutionary analysis of the cupin-like protein superfamily based on tertiary structure-based phylogenies that includes the oxygen-dependent AlkB and its homologs. Our results suggest that the repair of methylated RNA molecules is the outcome of the enzyme's substrate ambiguity, and does not necessarily indicate that the last common ancestor was endowed with an RNA genome.
Title: RNA or DNA? Revisiting the Chemical Nature of the Cenancestral Genome. Journal of Molecular Evolution, pages 647-658. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458739/pdf/
Pub Date: 2024-09-19; DOI: 10.1007/s00239-024-10207-7
Rodrigo Jácome
Many polymerases and other proteins are endowed with a catalytic domain belonging to the nucleotidyltransferase fold, which has also been termed the non-canonical palm domain, in which three conserved acidic residues coordinate two divalent metal ions. Tertiary structure-based evolutionary analyses provide valuable information when the phylogenetic signal contained in the primary structure is blurry or has been lost, as is the case with these proteins. Pairwise structural comparisons of proteins with a nucleotidyltransferase fold were performed in the PDBefold web server: the RMSD, the number of superimposed residues, and the Qscore were obtained. The structural alignment score (RMSD × 100/number of superimposed residues) and the 1-Qscore were calculated, and distance matrices were constructed, from which a dendrogram and a phylogenetic network were drawn for each score. The dendrograms and the phylogenetic networks display well-defined clades, reflecting high levels of structural conservation within each clade, not mirrored by primary sequence. The conserved structural core between all these proteins consists of the catalytic nucleotidyltransferase fold, which is surrounded by different functional domains. Hence, many of the clades include proteins that bind different substrates or partake in non-related functions. Enzymes endowed with a nucleotidyltransferase fold are present in all domains of life, and participate in essential cellular and viral functions, which suggests that this domain is very ancient. Despite the loss of evolutionary traces in their primary structure, tertiary structure-based analyses allow us to delve into the evolution and functional diversification of the NT fold.
Title: Structural and Evolutionary Analysis of Proteins Endowed with a Nucleotidyltransferase, or Non-canonical Palm, Catalytic Domain. Journal of Molecular Evolution.
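The two distance measures named in the nucleotidyltransferase abstract above, the structural alignment score (RMSD × 100/number of superimposed residues) and 1-Qscore, can be illustrated with a minimal Python sketch. This is not the author's pipeline: the protein names, the pairwise values, and the helper names `structural_alignment_score` and `build_distance_matrix` are hypothetical and chosen only to show how each score fills a symmetric distance matrix.

```python
def structural_alignment_score(rmsd, n_aligned):
    """Return RMSD x 100 / number of superimposed residues."""
    if n_aligned <= 0:
        raise ValueError("n_aligned must be positive")
    return rmsd * 100.0 / n_aligned


def build_distance_matrix(names, pairwise, distance):
    """Fill a symmetric matrix from per-pair comparison results."""
    index = {name: i for i, name in enumerate(names)}
    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]  # zero diagonal
    for (first, second), result in pairwise.items():
        i, j = index[first], index[second]
        matrix[i][j] = matrix[j][i] = distance(result)
    return matrix


# Hypothetical pairwise comparison results (illustrative numbers only,
# not taken from the study).
names = ["protA", "protB", "protC"]
pairwise = {
    ("protA", "protB"): {"rmsd": 2.0, "n_aligned": 100, "q": 0.5},
    ("protA", "protC"): {"rmsd": 3.0, "n_aligned": 150, "q": 0.25},
    ("protB", "protC"): {"rmsd": 1.5, "n_aligned": 300, "q": 0.75},
}

# One matrix per score, as the abstract describes.
sas_matrix = build_distance_matrix(
    names, pairwise,
    lambda r: structural_alignment_score(r["rmsd"], r["n_aligned"]),
)
q_matrix = build_distance_matrix(names, pairwise, lambda r: 1.0 - r["q"])

for name, row in zip(names, sas_matrix):
    print(name, row)
```

Either matrix could then be handed to a hierarchical clustering or network tool of the reader's choice to draw a dendrogram; which tool the study used is not stated in the abstract.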
Pub Date: 2024-09-13; DOI: 10.1007/s00239-024-10200-0
Guillermina Hill-Terán, Julieta Petrich, Maria Lorena Falcone Ferreyra, Manuel J. Aybar, Gabriela Coux
Treacher Collins syndrome (TCS) is a genetic disorder affecting facial development, primarily caused by mutations in the TCOF1 gene. TCOF1, along with NOLC1, plays important roles in ribosomal RNA transcription and processing. Previously, a zebrafish model of TCS successfully recapitulated the main characteristics of the syndrome by knocking down the expression of a gene on chromosome 13 (coding for Uniprot ID B8JIY2), which was identified as the TCOF1 orthologue. However, database updates renamed this gene as nolc1, and the zebrafish database (ZFIN) identified a different gene on chromosome 14 as the TCOF1 orthologue (coding for Uniprot ID E7F9D9). NOLC1 and TCOF1 are large proteins with unstructured regions and repetitive sequences that complicate alignments and comparisons. The additional whole genome duplication of teleosts adds further difficulty. In this study, we present evidence that NOLC1 and TCOF1 are paralogs, and that the zebrafish gene on chromosome 14 is a low-complexity LisH domain-containing factor that displays homology to NOLC1 but lacks sequence features essential for TCOF1 nucleolar functions. Our analysis also supports the idea that zebrafish, as has been suggested for other non-tetrapod vertebrates, lack the TCOF1 gene that is associated with the tripartite nucleolus. Using BLAST searches in a group of teleost genomes, we identified fish-specific sequences similar to the zebrafish E7F9D9 protein. We propose naming them “LisH-containing Low Complexity Proteins” (LLCP). Interestingly, the gene on chromosome 13 (nolc1) displays the sequence features, developmental expression patterns, and phenotypic impact of depletion that are characteristic of TCOF1 functions. These findings suggest that in teleost fish, the nucleolar functions described for both NOLC1 and TCOF1, mediated by their repeated motifs, are carried out by a single gene, nolc1. Our study, which is mainly based on computational tools available as free web-based algorithms, could help to solve similar conflicts regarding gene orthology in zebrafish.
Title: Untangling Zebrafish Genetic Annotation: Addressing Complexities and Nomenclature Issues in Orthologous Evaluation of TCOF1 and NOLC1. Journal of Molecular Evolution.
Pub Date: 2024-09-12; DOI: 10.1007/s00239-024-10192-x
J. A. Carlisle, D. H. Gurbuz, W. J. Swanson
Many reproductive proteins show signatures of rapid evolution through sequence divergence and duplication. These features of reproductive genes may complicate the detection of orthologs across taxa, making it difficult to connect studies in model systems to human biology. In mice, ZP3r/sp56 is a binding partner to the egg coat protein ZP3 and may mediate induction of the acrosome reaction, a crucial step in fertilization. In rodents, ZP3r, as a member of the Regulators of Complement Activation cluster, is surrounded by paralogs, some of which have been shown to be evolving under positive selection. Although primate egg coats also contain ZP3, sequence divergence paired with paralogous relationships with neighboring genes has complicated the accurate identification of the human ZP3r ortholog. Here, we phylogenetically and syntenically resolve that the human ortholog of ZP3r is the pseudogene C4BPAP1. We investigate the evolution of this gene within primates. We observe independent pseudogenization events of ZP3r in all Apes with the exception of Orangutans, and independent pseudogenization events in many monkey species. ZP3r in both primates that retain ZP3r and in rodents contains positively selected sites. We hypothesize that redundant mechanisms mediate ZP3 recognition in mammals and ZP3r’s relative importance to ZP recognition varies across species.
Title: Recurrent Independent Pseudogenization Events of the Sperm Fertilization Gene ZP3r in Apes and Monkeys. Journal of Molecular Evolution.
Pub Date: 2024-09-11; DOI: 10.1007/s00239-024-10198-5
Shinde Nikhil, Habeeb Shaikh Mohideen, Raja Natesan Sella
Sorghum (Sorghum bicolor (L.) Moench) is a multipurpose crop grown for food, fodder, and bioenergy production. Its cultivated varieties, along with their wild counterparts, contribute to the core genetic pool. Despite the availability of several re-sequenced sorghum genomes, a variable portion of sorghum genomes is not reported during reference genome assembly and annotation. The present analysis used 223 publicly available RNA-seq datasets from seven sweet sorghum cultivars to construct a superTranscriptome. This approach yielded 45,864 Representative Transcript Assemblies (RTAs) that showed Presence/Absence Variation (PAV) across 15 published sorghum genomes. We found that 301 superTranscripts were exclusive to sweet sorghum, including 58 de novo genes encoding core and linker histones, zinc finger domains, glucosyl transferases, cellulose synthase, etc. The superTranscriptome added 2,802 new protein-coding genes to the Sweet Sorghum Reference Genome (SSRG), of which 559 code for different transcription factors (TFs). Our analysis revealed that MULE-like transposases were abundant in the sweet sorghum genome and could play a hidden role in the evolution of sweet sorghum. We observed large deletions in the D locus and terminal deletions in four other NAC-encoding loci in the SSRG compared to its wild progenitor (353), suggesting that non-functional NAC genes contributed to trait development in sweet sorghum. Moreover, superTranscript-based methods for Differential Exon Usage (DEU) and Differential Gene Expression (DGE) analyses were more accurate than those based on the SSRG. This study demonstrates that the superTranscriptome can enhance our understanding of fundamental sorghum mechanisms, improve genome annotations, and potentially even replace the reference genome.
Title: Unveiling the Genomic Symphony: Identification Cultivar-Specific Genes and Enhanced Insights on Sweet Sorghum Genomes Through Comprehensive superTranscriptomic Analysis. Journal of Molecular Evolution.
Pub Date : 2024-09-11DOI: 10.1007/s00239-024-10199-4
Mario Rivas, George E. Fox
The Last Common Ancestor (LCA) is understood as a hypothetical population of organisms from which all extant living creatures are thought to have descended. Its biology and environment have been and continue to be the subject of discussion within the scientific community. Since the first bacterial genomes were obtained, multiple attempts have been made to reconstruct the genetic content of the LCA. In this review, we compare 10 of the most extensive reconstructions of the gene content possessed by the LCA as they relate to aspects of the translation machinery. Although each reconstruction has its own methodological biases and many disagree on the metabolic nature of the LCA, all, to some extent, indicate that several components of the translation machinery are among the most conserved genetic elements. The datasets from each reconstruction clearly show that the LCA already had a largely complete translational system with a genetic code already in place and therefore was not a progenote. Among these features, several ribosomal proteins, translation factors such as IF2, EF-G, and EF-Tu, and both class I and class II aminoacyl-tRNA synthetases were found in essentially all reconstructions. Due to the limitations of the various methodologies, some features, such as the occurrence of post-transcriptionally modified rRNA bases, are not fully addressed. However, conserved as the machinery is, non-universal ribosomal features found in various reconstructions indicate that the LCA’s translation machinery was still evolving and acquiring domain-specific features in the process. Although progenotes from the pre-LCA world likely no longer exist, recent results obtained by unraveling the early history of the ribosome and other genetic processes can provide insight into the nature of that world.
{"title":"On the Nature of the Last Common Ancestor: A Story from its Translation Machinery","authors":"Mario Rivas, George E. Fox","doi":"10.1007/s00239-024-10199-4","DOIUrl":"https://doi.org/10.1007/s00239-024-10199-4","url":null,"abstract":"<p>The Last Common Ancestor (LCA) is understood as a hypothetical population of organisms from which all extant living creatures are thought to have descended. Its biology and environment have been and continue to be the subject of discussion within the scientific community. Since the first bacterial genomes were obtained, multiple attempts have been made to reconstruct the genetic content of the LCA. In this review, we compare 10 of the most extensive reconstructions of the gene content possessed by the LCA as they relate to aspects of the translation machinery. Although each reconstruction has its own methodological biases and many disagree on the metabolic nature of the LCA, all, to some extent, indicate that several components of the translation machinery are among the most conserved genetic elements. The datasets from each reconstruction clearly show that the LCA already had a largely complete translational system with a genetic code already in place and therefore was not a <i>progenote</i>. Among these features, several ribosomal proteins, translation factors such as IF2, EF-G, and EF-Tu, and both class I and class II aminoacyl-tRNA synthetases were found in essentially all reconstructions. Due to the limitations of the various methodologies, some features, such as the occurrence of post-transcriptionally modified rRNA bases, are not fully addressed. However, conserved as the machinery is, non-universal ribosomal features found in various reconstructions indicate that the LCA’s translation machinery was still evolving and acquiring domain-specific features in the process. 
Although progenotes from the pre-LCA world likely no longer exist, recent results obtained by unraveling the early history of the ribosome and other genetic processes can provide insight into the nature of that world.</p>","PeriodicalId":16366,"journal":{"name":"Journal of Molecular Evolution","volume":"55 1","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142194312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}