A Narrow Range of Transcript-error Rates Across the Tree of Life
Weiyi Li, Stephan Baehr, Michelle Marasco, Lauren Reyes, Danielle Brister, Craig S Pikaard, Jean-Francois Gout, Marc Vermulst, Michael Lynch
bioRxiv: the preprint server for biology. Publication date: 2025-01-14. DOI: 10.1101/2023.05.02.538944 (https://doi.org/10.1101/2023.05.02.538944). Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11761650/pdf/
Abstract
The expression of genomically-encoded information is not error-free. Transcript-error rates are dramatically higher than DNA-level mutation rates, and despite their transient nature, the steady-state load of such errors must impose some burden on cellular performance. However, a broad perspective on the degree to which transcript-error rates are constrained by natural selection and diverge among lineages remains to be developed. Here, we present a genome-wide analysis of transcript-error rates across the Tree of Life using a modified rolling-circle sequencing method, revealing that the range in error rates is remarkably narrow across diverse species. Transcript errors tend to be randomly distributed, with little evidence supporting local control of error rates associated with gene-expression levels. A majority of transcript errors result in missense errors if translated, and as with a fraction of nonsense transcript errors, these are underrepresented relative to random expectations, suggesting the existence of mechanisms for purging some such errors. To quantitatively understand how natural selection and random genetic drift might shape transcript-error rates across species, we present a model based on cell biology and population genetics, incorporating information on cell volume, proteome size, average degree of exposure of individual errors, and effective population size. However, while this model provides a framework for understanding the evolution of this highly conserved trait, as currently structured it explains only 20% of the variation in the data, suggesting a need for further theoretical work in this area.