Gemma I Martínez-Redondo, Carlos Vargas-Chávez, Klara Eleftheriadi, Lisandra Benítez-Álvarez, Marçal Vázquez-Valls, Rosa Fernández
Recent advances in high-throughput sequencing have exponentially increased the amount of genomic data available for animals (Metazoa) in the last decades, with high-quality chromosome-level genomes being published almost daily. Nevertheless, generating a new genome is not an easy task due to the high cost of genome sequencing, the high complexity of assembly, and the lack of standardized protocols for genome annotation. The lack of consensus in the annotation and publication of genome files hinders research: researchers lose time reformatting the files for their purposes, and the quality of the genetic repertoire used in an evolutionary study can suffer. Thus, the use of transcriptomes obtained using the same pipeline as a proxy for the genetic content of species remains a valuable resource that is easier to obtain, cheaper, and more comparable than genomes. In a previous study, we presented the Metazoan Assemblies from Transcriptomic Ensembles database (MATEdb), a repository of high-quality transcriptomic and genomic data for the two most diverse animal phyla, Arthropoda and Mollusca. Here, we present the newest version of MATEdb (MATEdb2) that overcomes some of the previous limitations of our database: (i) we include data from all animal phyla where public data are available, and (ii) we provide gene annotations extracted from the original GFF genome files using the same pipeline. In total, we provide proteomes inferred from high-quality transcriptomic or genomic data for almost 1,000 animal species, including the longest isoforms, all isoforms, and functional annotation based on sequence homology and protein language models, as well as the embedding representations of the sequences. We believe this new version of MATEdb will accelerate research on animal phylogenomics while saving thousands of hours of computational work in a plea for open, greener, and collaborative science.
MATEdb2, a Collection of High-Quality Metazoan Proteomes across the Animal Tree of Life to Speed Up Phylogenomic Studies. Gemma I Martínez-Redondo, Carlos Vargas-Chávez, Klara Eleftheriadi, Lisandra Benítez-Álvarez, Marçal Vázquez-Valls, Rosa Fernández. Genome Biology and Evolution 16(11), 2024-11-01. DOI: 10.1093/gbe/evae235. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534026/pdf/
Xin Zhang, Iseult Leahy, Jérȏme Collemare, Michael F Seidl
Fungi are well-known producers of bioactive secondary metabolites (SMs), which have been exploited for decades by humankind for various medical applications like therapeutics and antibiotics. SMs are synthesized by biosynthetic gene clusters (BGCs), sets of physically co-localized and co-regulated genes. Because BGCs are often regulated by histone post-translational modifications (PTMs), it was suggested that their chromosomal location is important for their expression. Studies in a few fungal species indicated an enrichment of BGCs in sub-telomeric regions; however, there is no evidence that BGCs with distinct genomic localization are regulated by different histone PTMs. Here, we used 174 Aspergillus species covering 22 sections to determine the correlation between BGC genomic localization, gene expression, and histone PTMs. We found a high abundance and diversity of SM backbone genes across the Aspergillus genus, with notable unique genes within sections. Whether unique or conserved in many species, BGCs showed a strong bias for being localized in low-synteny regions, regardless of their position in chromosomes. Using chromosome-level assemblies, we also confirmed a significantly biased localization in sub-telomeric regions. Notably, SM backbone genes in sub-telomeric regions and about half of those in low-synteny regions exhibit higher gene expression variability, likely due to the similarly higher variability in H3K4me3 and H3K36me3 histone PTMs, whereas variations in histone H3 acetylation and H3K9me3 are not correlated with genomic localization and expression variation, as analyzed in two Aspergillus species. Expression variability across four Aspergillus species further supports that BGCs tend to be located in low-synteny regions and that regulation of expression in those regions likely involves different histone PTMs than the most commonly studied modifications.
Genomic Localization Bias of Secondary Metabolite Gene Clusters and Association with Histone Modifications in Aspergillus. Xin Zhang, Iseult Leahy, Jérȏme Collemare, Michael F Seidl. Genome Biology and Evolution, 2024-11-01. DOI: 10.1093/gbe/evae228. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11542625/pdf/
The origins and early evolution of animals are subjects with many outstanding questions. One problem faced by researchers trying to answer them is the absence of a comprehensive database with sequences from nonbilaterians. Publicly available data are plentiful but scattered and often not associated with proper metadata. A new database presented in this paper, LukProt, is an attempt at solving this issue. The database contains protein sequences obtained mostly from genomic, transcriptomic, and metagenomic studies and is an extension of EukProt (Richter DJ, Berney C, Strassert JFH, Poh Y-P, Herman EK, Muñoz-Gómez SA, Wideman JG, Burki F, de Vargas C. EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Community J. 2022:2:e56. https://doi.org/10.24072/pcjournal.173). LukProt adopts the EukProt naming conventions and includes data from 216 additional animals. The database is associated with a taxonomic grouping (taxogroup) scheme suitable for studying early animal evolution. Minor updates to the database will contain species additions or metadata corrections, whereas major updates will synchronize LukProt to each new version of EukProt, and releases are permanently stored on Zenodo (https://doi.org/10.5281/zenodo.7089120). A BLAST server to search the database is available at: https://lukprot.hirszfeld.pl/. Users are invited to participate in maintaining and correcting LukProt. As it can be searched without downloading locally, the database aims to be a convenient resource not only for evolutionary biologists, but for the broader scientific community as well.
LukProt: A Database of Eukaryotic Predicted Proteins Designed for Investigations of Animal Origins. Łukasz F Sobala. Genome Biology and Evolution, 2024-11-01. DOI: 10.1093/gbe/evae231. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534060/pdf/
Yeshoda Y Harry-Paul, Josianne Lachapelle, Rob W Ness
When environmental change is rapid or unpredictable, phenotypic plasticity can facilitate adaptation to new or stressful environments to promote population persistence long enough for adaptive evolution to occur. However, the underlying genetic mechanisms that contribute to plasticity and its role in adaptive evolution are generally unknown. Two main opposing hypotheses dominate: genetic compensation and genetic assimilation. Here, we predominantly find evidence for genetic compensation over assimilation in adapting the freshwater alga Chlamydomonas reinhardtii to 36 g/L salt environments over 500 generations. More canalized genes in the high-salt (HS) lines displayed a pattern of genetic compensation (63%), fixing near or at the ancestral native expression level, rather than genetic assimilation of the salt-induced level, suggesting that compensation was more common during adaptation to salt. Network analysis revealed an enrichment of genes involved in energy production and salt-resistance processes in HS lines, while an increase in DNA repair mechanisms was seen in ancestral strains. In addition, whole-transcriptome similarity among ancestral and HS lines displayed the evolution of a similar plastic response to salt conditions in independently reared HS lines. We also found more cis-acting regions in the HS lines; however, the expression patterns of most genes did not mimic that of their inherited sequence. Thus, the expression changes induced via plasticity offer temporary relief, but downstream changes are required for a sustainable solution during the evolutionary process.
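The distinction between the two hypotheses can be illustrated with a minimal sketch. It assumes three log-scale expression values per gene (ancestor in its native medium, ancestor exposed to salt, and the evolved high-salt line) and labels the gene by whichever ancestral level the evolved value sits closer to. The function name, the inputs, and the nearest-level rule are all hypothetical; the abstract does not state the statistical criteria the study actually used.

```python
def classify_expression(ancestral_native, ancestral_induced, evolved):
    """Label one gene by the ancestral level its evolved expression is closer to.

    Assumed inputs (hypothetical, not from the source): log-scale expression
    of the ancestor in native medium, the ancestor under salt, and the
    evolved high-salt line.
    """
    distance_to_native = abs(evolved - ancestral_native)
    distance_to_induced = abs(evolved - ancestral_induced)
    if distance_to_native < distance_to_induced:
        # Expression returned toward the ancestral native level.
        return "compensation"
    if distance_to_induced < distance_to_native:
        # Expression became fixed near the salt-induced level.
        return "assimilation"
    return "ambiguous"


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(classify_expression(5.0, 9.0, 5.5))
    print(classify_expression(5.0, 9.0, 8.7))
```

A real analysis would replace the nearest-level rule with a test that accounts for replicate variance; the sketch only shows what each label means.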
The Evolution of Gene Expression Plasticity During Adaptation to Salt in Chlamydomonas reinhardtii. Yeshoda Y Harry-Paul, Josianne Lachapelle, Rob W Ness. Genome Biology and Evolution, 2024-11-01. DOI: 10.1093/gbe/evae214. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534027/pdf/
Jacob F Warner, Ryan C Range, Jennifer Fenner, Cheikouna Ka, Damien S Waits, Kristen Boddy, Kyle T David, Andrew R Mahon, Kenneth M Halanych
The Antarctic sea urchin Sterechinus neumayeri (Echinoida; Echinidae) is routinely used as a model organism for Antarctic biology. Here, we present a high-quality genome of S. neumayeri. This chromosomal-level assembly was generated using PacBio long-read sequencing and Hi-C chromatin conformation capture sequencing. This 885.3-Mb assembly exhibits high contiguity with a scaffold length N50 of 36.7 Mb assembled into 20 chromosomal length scaffolds. These putative chromosomes exhibit a high degree of synteny compared to other sea urchin models. We used transcript evidence gene modeling combined with sequence homology to identify 21,638 gene models that capture 97.4% of BUSCO orthologs. Among these, we were able to identify and annotate conserved developmental gene regulatory network orthologs, positioning S. neumayeri as a tractable model for comparative studies on evolution and development.
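For readers unfamiliar with the contiguity statistic quoted above, the scaffold N50 is the length L such that scaffolds of length L or longer together cover at least half of the total assembly. A minimal sketch of that standard definition follows; the function name and the toy lengths are illustrative and are not taken from the assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sorts lengths from longest to shortest and returns the length at which
    the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy scaffold lengths, not the S. neumayeri scaffolds.
    print(n50([2, 3, 4, 5, 6]))
```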
Chromosomal-Level Genome Assembly of the Antarctic Sea Urchin Sterechinus neumayeri: A Model for Antarctic Invertebrate Biology. Jacob F Warner, Ryan C Range, Jennifer Fenner, Cheikouna Ka, Damien S Waits, Kristen Boddy, Kyle T David, Andrew R Mahon, Kenneth M Halanych. Genome Biology and Evolution, 2024-11-01. DOI: 10.1093/gbe/evae237. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11586663/pdf/
Jonghwan Choi, Taemin Kang, Sun-Jae Park, Seunggwan Shin
Urbanization is a leading factor affecting global biodiversity, driving rapid evolutionary processes in the local biota. Species that adapt and proliferate in city environments can become pests, with human activities facilitating their dispersal and excessive outbreaks. Here we present the first genome data of Plecia longiforceps, a lovebug pest in Eastern Asia with intensive aggregations recently occurring in the Seoul Metropolitan Area of Korea. PacBio HiFi and ONT Pore-C sequencing data were used to construct a highly continuous assembly with a total size of 707 Mb and 8 major pseudochromosomes, its integrity supported by the N50 length of 98.1 Mb and 96.8% BUSCO completeness. Structural and functional annotation using transcriptome data and ab initio predictions revealed a high proportion (69.3%) of repeat sequences, and synteny analysis with Bibio marci showed high levels of genomic collinearity. The genome will serve as an essential resource for both population genomics and molecular research on lovebug dispersal and outbreaks, and will also support studies on the eco-evolutionary processes of insects in urbanizing habitats.
A Chromosome-Scale and Annotated Reference Genome Assembly of Plecia longiforceps Duda, 1934 (Diptera: Bibionidae). Jonghwan Choi, Taemin Kang, Sun-Jae Park, Seunggwan Shin. Genome Biology and Evolution, 2024-10-09. DOI: 10.1093/gbe/evae205. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11474240/pdf/
Bacteria lose and gain repair genes as they evolve. Here, we investigate the consequences of gain and loss of 11 DNA repair genes across a broad range of bacteria. Using synonymous polymorphisms from bacteria and a set of 50 phylogenetically independent contrasts, we find no evidence that the presence or absence of these 11 genes affects either the overall level of diversity or the pattern of mutation. Using phylogenetic generalized least squares yields a similar conclusion. It seems likely that the lack of an effect is due to variation in the genetic background and the environment, which obscures any effects that the presence or absence of individual genes might have.
The Effect of the Presence and Absence of DNA Repair Genes on the Rate and Pattern of Mutation in Bacteria. Georgios Kalogiannis, Adam Eyre-Walker. Genome Biology and Evolution, 2024-10-09. DOI: 10.1093/gbe/evae216. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11493085/pdf/
Suha Naser-Khdour, Fabian Scheuber, Peter D Fields, Dieter Ebert
Genomic regions that play a role in parasite defense are often found to be highly variable, with the major histocompatibility complex serving as an iconic example. Single nucleotide polymorphisms may represent only a small portion of this variability, with indel polymorphisms and copy number variation further contributing. In extreme cases, haplotypes may no longer be recognized as orthologous. Understanding the evolution of such highly divergent regions is challenging because the most extreme variation is not visible using reference-assisted genomic approaches. Here we analyze the case of the Pasteuria Resistance Complex in the crustacean Daphnia magna, a defense complex in the host against the common and virulent bacterium Pasteuria ramosa. Two haplotypes of this region have been previously described, with parts of it being nonhomologous, and the region has been shown to be under balancing selection. We used pan-genome analysis and tree reconciliation methods to explore the evolution of the Pasteuria Resistance Complex and its characteristics within and between species of Daphnia and other Cladoceran species. Our analysis revealed a remarkable diversity in this region even among host species, with many nonhomologous hyper-divergent haplotypes. The Pasteuria Resistance Complex is characterized by extensive duplication and losses of Fucosyltransferase (FuT) and Galactosyltransferase (GalT) genes that are believed to play a role in parasite defense. The Pasteuria Resistance Complex region can be traced back to common ancestors over 250 million years ago. The unique combination of an ancient resistance complex and a dynamic, hyper-divergent genomic environment presents a fascinating opportunity to investigate the role of such regions in the evolution and long-term maintenance of resistance polymorphisms. 
Our findings offer valuable insights into the evolutionary forces shaping disease resistance and adaptation, not only in the genus Daphnia, but potentially across the entire Cladocera class.
The Evolution of Extreme Genetic Variability in a Parasite-Resistance Complex. Suha Naser-Khdour, Fabian Scheuber, Peter D Fields, Dieter Ebert. Genome Biology and Evolution, 2024-10-09. DOI: 10.1093/gbe/evae222. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11500718/pdf/
Henrique Moura Dias, Naiara Almeida de Toledo, Ravi V Mural, James C Schnable, Marie-Anne Van Sluys
Molecular evolution analysis typically involves identifying selection pressure and reconstructing evolutionary trends. This process usually requires access to specific data related to a target gene or gene family within a particular group of organisms. While recent advancements in high-throughput sequencing techniques have resulted in the rapid accumulation of extensive genomics and transcriptomics data and the creation of new databases in public repositories, extracting valuable insights from such vast data sets remains a significant challenge for researchers. Here, we elucidated the evolutionary history of THI1, a gene encoding thiamine thiazole synthase. The thiazole ring is a precursor of vitamin B1, a crucial cofactor in primary metabolic pathways. A thorough search of complete genomes available within public repositories reveals 702 THI1 homologs in Archaea and Eukarya. Throughout its diversification, the plant lineage has preserved the THI1 gene by incorporating the N-terminus and targeting the chloroplasts. Likewise, evolutionary pressures and lifestyle appear to be associated with retention of TPP riboswitch sites and consequent dual posttranscriptional regulation of the de novo biosynthesis pathway in basal groups. Multicopy retention of THI1 is not a typical plant pattern, even after successive genome duplications. Examining cis-regulatory sites in plants uncovers two motifs shared across all plant lineages. Data mining of 484 transcriptome data sets supports THI1 homolog expression that responds to the light/dark cycle and follows a tissue-specific pattern. Finally, this work offers a new look at public repositories as an opportunity to explore evolutionary trends in THI1.
THI1 Gene Evolutionary Trends: A Comprehensive Plant-Focused Assessment via Data Mining and Large-Scale Analysis. Henrique Moura Dias, Naiara Almeida de Toledo, Ravi V Mural, James C Schnable, Marie-Anne Van Sluys. Genome Biology and Evolution, published 2024-10-09. DOI: 10.1093/gbe/evae212. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11521341/pdf/
The multigene family of the major histocompatibility complex (MHC) codes for the key antigen-presenting molecules of the vertebrate immune system. In birds, duplicated MHC class II (MHC-II) genes are highly homogenized by concerted evolution, and thus, identification of their orthologous relationships across long evolutionary timescales remains challenging. The relatively low evolutionary rate of avian MHC class IIA (MHC-IIA) genes has been expected to provide a promising avenue for such inferences, but the availability of MHC-IIA sequences in nonmodel bird species has been limited until recently. Here, taking advantage of accumulating genomic resources, we identified and analyzed MHC-IIA sequences from the most basal lineage of extant birds (Palaeognathae). A conserved region of the MHC-IIA membrane-proximal domain was used to search for orthologous relationships between palaeognath birds and nonavian reptiles. First, analyses of palaeognath sequences revealed the presence of a separate MHC-IIA gene lineage (DAA3) in kiwis, which did not cluster with previously described avian MHC-IIA lineages (DAA1 and DAA2). Next, phylogenetic reconstruction showed that kiwi DAA3 sequences form a single well-supported cluster with turtle MHC-IIA. The high similarity of these sequences most likely reflects their remarkable evolutionary conservation and retention of ancient orthologous relationships, which can be traced back to basal archosauromorphs ca. 250 million years ago. Our analyses offer novel insights into the macroevolutionary history of the MHC and reinforce the view that rapid accumulation of high-quality genome assemblies across divergent nonmodel species can substantially advance our understanding of gene evolution.
Palaeognaths Reveal Evolutionary Ancestry of the Avian Major Histocompatibility Complex Class II. Piotr Minias, Wiesław Babik. Genome Biology and Evolution, published 2024-10-09. DOI: 10.1093/gbe/evae211. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11487930/pdf/