Global and local genomic features together modulate the spontaneous single nucleotide mutation rate
Akash Ajay, Tina Begum, Ajay Arya, Krishan Kumar, Shandar Ahmad
Computational Biology and Chemistry, published 2024-05-22
DOI: 10.1016/j.compbiolchem.2024.108107
https://www.sciencedirect.com/science/article/pii/S1476927124000951
Abstract
Spontaneous mutations are engines of evolution: they generate the variants on which downstream evolutionary processes act, giving rise to speciation and adaptation. Single nucleotide mutations (SNMs) are the most abundant type of spontaneous mutation. Here, we perform a meta-analysis to quantify the influence of selected global genomic parameters (genome size, genomic GC content, genomic repeat fraction, number of coding genes, gene count, and strand bias in prokaryotes) and local genomic features (local GC content, repeat content, CpG content, and the number of SNMs at CpG islands) on spontaneous SNM rates across the tree of life (prokaryotes, unicellular eukaryotes, and multicellular eukaryotes), using wild-type sequence data under two different taxon classification systems. We find that the spontaneous SNM rates in our data are correlated with many genomic features in prokaryotes and unicellular eukaryotes, irrespective of sample size. In multicellular eukaryotes, by contrast, only the number of coding genes is correlated with the spontaneous SNM rates, and this correlation is contributed primarily by the vertebrate data. Among local features, local GC content and CpG content are significantly correlated with the spontaneous SNM rates in unicellular eukaryotes, while local repeat fraction is an important feature in prokaryotes and in certain specific unicellular and multicellular eukaryotes. For such predictive features, non-linear models often fit the spontaneous SNM rates better than the linear model. We also observe that strand asymmetry in prokaryotes plays an important role in determining the spontaneous SNM rates, whereas the SNM spectrum does not.
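The comparison of linear and non-linear fits mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which model families or selection criterion the authors used, so everything here is an assumption: the non-linear model is a quadratic polynomial, models are ranked by the Akaike information criterion (AIC), the function and variable names are hypothetical, and the data are synthetic values generated only for illustration, not taken from the paper.

```python
import numpy as np


def aic_least_squares(y, y_hat, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    n = len(y)
    rss = float(np.sum((y - y_hat) ** 2))  # residual sum of squares
    return n * np.log(rss / n) + 2 * n_params


def compare_linear_vs_quadratic(x, y):
    """Fit degree-1 and degree-2 polynomials to (x, y); return the AIC of each."""
    results = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        coeffs = np.polyfit(x, y, degree)
        y_hat = np.polyval(coeffs, x)
        # A polynomial of degree d has d + 1 fitted coefficients.
        results[name] = aic_least_squares(y, y_hat, n_params=degree + 1)
    return results


# Synthetic, purely illustrative data (NOT from the paper): a curved
# relationship between a genomic feature and the log mutation rate.
rng = np.random.default_rng(0)
log_genome_size = np.linspace(6.0, 10.0, 40)
log_snm_rate = (
    -9.5
    + 0.4 * (log_genome_size - 8.0) ** 2
    + rng.normal(0.0, 0.1, size=log_genome_size.size)
)

aic = compare_linear_vs_quadratic(log_genome_size, log_snm_rate)
best_model = min(aic, key=aic.get)  # lower AIC indicates the preferred model
print(aic, best_model)
```

A lower AIC indicates the preferred model once the extra parameter is penalized. With real data, the same comparison would be run separately for each genomic feature and taxon group.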
About the journal:
Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered.
Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, complementary techniques such as molecular dynamics simulations with detailed free energy calculations should be used to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered.
Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief synopsis (one to three pages), which will be evaluated by the editors.