Benjamin D Redelings, Ian Holmes, Gerton Lunter, Tal Pupko, Maria Anisimova
Insertions and deletions constitute the second most important source of natural genomic variation. They account for up to 25% of genomic variants in humans and are involved in complex evolutionary processes including genomic rearrangements, adaptation, and speciation. Recent advances in long-read sequencing technologies allow detailed inference of insertion and deletion variation in species and populations. Yet, despite their importance, evolutionary studies have traditionally ignored or mishandled insertions and deletions due to a lack of comprehensive methodologies and statistical models of insertion and deletion dynamics. Here, we discuss methods for describing insertion and deletion variation and for modeling insertions and deletions over evolutionary time. We provide practical advice for tackling insertions and deletions in genomic sequences and illustrate our discussion with examples of insertion- and deletion-induced effects in human and other natural populations and their contribution to evolutionary processes. We outline promising directions for future developments in statistical methodologies that would allow researchers to analyze insertion and deletion variation and its effects in large genomic data sets and to incorporate insertions and deletions in evolutionary inference.
Insertions and Deletions: Computational Methods, Evolutionary Dynamics, and Biological Applications. Molecular Biology and Evolution, published 2024-09-04. doi:10.1093/molbev/msae177. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11385596/pdf/
John B McAuley, Bertrand Servin, Hamish A Burnett, Cathrine Brekke, Lucy Peters, Ingerid J Hagen, Alina K Niskanen, Thor Harald Ringsby, Arild Husby, Henrik Jensen, Susan E Johnston
Meiotic recombination through chromosomal crossing-over is a fundamental feature of sex and an important driver of genomic diversity. It ensures proper disjunction, allows increased selection responses, and prevents mutation accumulation; however, it is also mutagenic and can break up favorable haplotypes. This cost-benefit dynamic is likely to vary depending on mechanistic and evolutionary contexts, and indeed, recombination rates show huge variation in nature. Identifying the genetic architecture of this variation is key to understanding its causes and consequences. Here, we investigate individual recombination rate variation in wild house sparrows (Passer domesticus). We integrate genomic and pedigree data to identify autosomal crossover counts (ACCs) and intrachromosomal allelic shuffling (r̄_intra) in 13,056 gametes transmitted from 2,653 individuals to their offspring. Females had 1.37 times higher ACC and 1.55 times higher r̄_intra than males. ACC and r̄_intra were heritable in females and males (ACC h² = 0.23 and 0.11; r̄_intra h² = 0.12 and 0.14), but cross-sex additive genetic correlations were low (r_A = 0.29 and 0.32 for ACC and r̄_intra). Conditional bivariate analyses showed that all measures remained heritable after accounting for genetic values in the opposite sex, indicating that sex-specific ACC and r̄_intra can evolve somewhat independently. Genome-wide models showed that ACC and r̄_intra are polygenic and driven by many small-effect loci, many of which are likely to act in trans as global recombination modifiers. Our findings show that recombination rates of females and males can have different evolutionary potential in wild birds, providing a compelling mechanism for the evolution of sexual dimorphism in recombination.
The Genetic Architecture of Recombination Rates is Polygenic and Differs Between the Sexes in Wild House Sparrows (Passer domesticus). Molecular Biology and Evolution, published 2024-09-04. doi:10.1093/molbev/msae179. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11385585/pdf/
Benjamin C Klementz, Georg Brenneis, Isaac A Hinne, Ethan M Laumer, Sophie M Neu, Grace M Hareid, Guilherme Gainett, Emily V W Setton, Catalina Simian, David E Vrech, Isabella Joyce, Austen A Barnett, Nipam H Patel, Mark S Harvey, Alfredo V Peretti, Monika Gulia-Nuss, Prashant P Sharma
Neofunctionalization of duplicated gene copies is thought to be an important process underlying the origin of evolutionary novelty and provides an elegant mechanism for the origin of new phenotypic traits. One putative case where a new gene copy has been linked to a novel morphological trait is the origin of the arachnid patella, a taxonomically restricted leg segment. In spiders, the origin of this segment has been linked to the origin of the paralog dachshund-2, suggesting that a new gene facilitated the expression of a new trait. However, various arachnid groups that possess patellae do not have a copy of dachshund-2, disfavoring the direct link between gene origin and trait origin. We investigated the developmental genetic basis for patellar patterning in the harvestman Phalangium opilio, which lacks dachshund-2. Here, we show that the harvestman patella is established by a novel expression domain of the transcription factor extradenticle. Leveraging this definition of patellar identity, we surveyed targeted groups across chelicerate phylogeny to assess when this trait evolved. We show that a patellar homolog is present in Pycnogonida (sea spiders) and various arachnid orders, suggesting a single origin of the patella in the ancestor of Chelicerata. A potential loss of the patella is observed in Ixodida. Our results suggest that the modification of an ancient gene, rather than the neofunctionalization of a new gene copy, underlies the origin of the patella. Broadly, this work underscores the value of comparative data and broad taxonomic sampling when testing hypotheses in evolutionary developmental biology.
A Novel Expression Domain of extradenticle Underlies the Evolutionary Developmental Origin of the Chelicerate Patella. Molecular Biology and Evolution, published 2024-09-04. doi:10.1093/molbev/msae188. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11422720/pdf/
Andrea González-González, Tiffany N Batarseh, Alejandra Rodríguez-Verdugo, Brandon S Gaut
Epistasis is caused by genetic interactions among mutations that affect fitness. To characterize properties and potential mechanisms of epistasis, we engineered eight double mutants that combined mutations from the rho and rpoB genes of Escherichia coli. The two genes encode essential functions for transcription, and the mutations in each gene were chosen because they were beneficial for adaptation to thermal stress (42.2 °C). The double mutants exhibited patterns of fitness epistasis that included diminishing returns epistasis at 42.2 °C, stronger diminishing returns between mutations with larger beneficial effects, and both negative and positive (sign) epistasis across environments (20.0 °C and 37.0 °C). By comparing gene expression between single and double mutants, we detected hundreds of genes with gene expression epistasis. Previous work postulated that highly connected hub genes in coexpression networks have low epistasis, but we found the opposite: hub genes had high epistasis values in both coexpression and protein-protein interaction networks. We hypothesized that elevated epistasis in hub genes reflected enrichment for targets of Rho termination, but that was not the case. Altogether, gene expression and coexpression analyses revealed that thermal adaptation occurred in modules, through modulation of ribonucleotide biosynthetic processes and ribosome assembly, attenuation of expression in genes related to heat shock and stress responses, and an overall trend toward restoring gene expression to the unstressed state.
Patterns of Fitness and Gene Expression Epistasis Generated by Beneficial Mutations in the rho and rpoB Genes of Escherichia coli during High-Temperature Adaptation. Molecular Biology and Evolution, published 2024-09-04. doi:10.1093/molbev/msae187. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11414761/pdf/
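The fitness-epistasis categories named in the rho/rpoB abstract above (diminishing returns, negative, positive, and sign epistasis) can be illustrated with a minimal sketch. The abstract does not state how epistasis was quantified, so the multiplicative expectation on relative fitness (ancestor = 1) used here is an assumption, and the function names and fitness values are hypothetical rather than taken from the study.

```python
def epistasis(w_a: float, w_b: float, w_ab: float) -> float:
    """Deviation of the double mutant's relative fitness from the
    multiplicative expectation w_a * w_b (assumed model, ancestor = 1)."""
    return w_ab - w_a * w_b


def classify(w_a: float, w_b: float, w_ab: float, tol: float = 1e-9) -> str:
    """Label the interaction between two mutations (illustrative categories)."""
    eps = epistasis(w_a, w_b, w_ab)
    if abs(eps) <= tol:
        return "no epistasis"
    if w_a > 1 and w_b > 1:
        # Both mutations are beneficial on their own.
        if w_ab < max(w_a, w_b):
            # Adding the second mutation lowers fitness below a single mutant.
            return "sign epistasis"
        if eps < 0:
            # Still at least as fit as each single mutant, but below expectation.
            return "diminishing returns"
    return "positive epistasis" if eps > 0 else "negative epistasis"


# Hypothetical relative fitness values, for illustration only.
for w_a, w_b, w_ab in [(1.2, 1.3, 1.4), (1.2, 1.3, 1.1), (1.2, 1.3, 1.7)]:
    print(w_a, w_b, w_ab, classify(w_a, w_b, w_ab))
```

An additive expectation on log fitness is an equally common choice; swapping it in changes only the body of `epistasis`.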
Epigenetics research in evolutionary biology encompasses a variety of research areas, from regulation of gene expression to inheritance of environmentally mediated phenotypes. Such divergent research foci can occasionally render the umbrella term "epigenetics" ambiguous. Here I discuss several areas of contemporary epigenetics research in the context of evolutionary biology, aiming to provide balanced views across timescales and molecular mechanisms. The importance of epigenetics in development is now being assessed in many nonmodel species. These studies not only confirm the importance of epigenetic marks in developmental processes, but also highlight the significant diversity in epigenetic regulatory mechanisms across taxa. Further, these comparative epigenomic studies have begun to show promise toward enhancing our understanding of how regulatory programs evolve. A key property of epigenetic marks is that they can be inherited along mitotic cell lineages, and epigenetic differences that occur during early development can have lasting consequences on the organismal phenotypes. Thus, epigenetic marks may play roles in short-term (within an organism's lifetime or to the next generation) adaptation and phenotypic plasticity. However, the extent to which observed epigenetic variation occurs independently of genetic influences remains uncertain, due to the widespread impact of genetics on epigenetic variation and the limited availability of comprehensive (epi)genomic resources from most species. While epigenetic marks can be inherited independently of genetic sequences in some species, there is little evidence that such "transgenerational inheritance" is a general phenomenon. Rather, molecular mechanisms of epigenetic inheritance are highly variable between species.
Epigenetics Research in Evolutionary Biology: Perspectives on Timescales and Mechanisms. Soojin V Yi. Molecular Biology and Evolution, 41(9), published 2024-09-03. doi:10.1093/molbev/msae170. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11376073/pdf/
Emma F Harding, Lewis K Mercer, Grace J H Yan, Paul D Waters, Peter A White
Retroviruses are an ancient viral family that has coevolved with vertebrates worldwide and impacted their evolution. In Australia, a continent that has been geographically isolated for millions of years, little is known about retroviruses in wildlife, despite the devastating impacts of a retrovirus on endangered koala populations. We therefore sought to identify and characterize Australian retroviruses through reconstruction of endogenous retroviruses from marsupial genomes, in particular that of the Tasmanian devil, owing to its high cancer incidence. We screened 19 marsupial genomes and identified over 80,000 endogenous retrovirus fragments, which we classified into eight retrovirus clades. The retroviruses were similar to either Betaretrovirus (5/8) or Gammaretrovirus (3/8), but formed phylogenetic clades distinct from extant retroviruses. One of the clades (MEBrv 3) lost an envelope but retained retrotranspositional activity, subsequently amplifying throughout all Dasyuridae genomes. Overall, we provide insights into Australian retrovirus evolution and identify a highly active endogenous retrovirus within Dasyuridae genomes.
Invasion and Amplification of Endogenous Retroviruses in Dasyuridae Marsupial Genomes. Molecular Biology and Evolution, published 2024-08-02. doi:10.1093/molbev/msae160. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11334065/pdf/
Several mammalian genes have originated from the domestication of retrotransposons, selfish mobile elements related to retroviruses. Some of the proteins encoded by these genes have maintained virus-like features, including self-processing, capsid structure formation, and the generation of different isoforms through −1 programmed ribosomal frameshifting. Using quantitative approaches in molecular evolution and biophysical analyses, we studied 28 retrotransposon-derived genes, with a focus on the evolution of virus-like features. By analyzing the rate of synonymous substitutions, we show that the −1 programmed ribosomal frameshifting mechanism in three of these genes (PEG10, PNMA3, and PNMA5) is conserved across mammals and gives rise to alternative proteins. These genes were targets of positive selection in primates, and one of the positively selected sites affects a B-cell epitope on the spike domain of the PNMA5 capsid, a finding reminiscent of observations in infectious viruses. More generally, we found that retrotransposon-derived proteins vary in their intrinsically disordered region content and that this content is directly associated with their evolutionary rates. Most positively selected sites in these proteins are located in intrinsically disordered regions, and some of them affect protein posttranslational modifications, such as autocleavage and phosphorylation. Detailed analyses of the biophysical properties of intrinsically disordered regions showed that positive selection preferentially targeted regions with lower conformational entropy. Furthermore, positive selection introduces variation in binary sequence patterns across orthologues, as well as in chain compaction. Our results shed light on the evolutionary trajectories of a unique class of mammalian genes and suggest a novel approach to study how the biophysical characteristics of intrinsically disordered regions are affected by evolution.
Evolution of Virus-like Features and Intrinsically Disordered Regions in Retrotransposon-derived Mammalian Genes. Rachele Cagliani, Diego Forni, Alessandra Mozzi, Rotem Fuchs, Dafna Tussia-Cohen, Federica Arrigoni, Uberto Pozzoli, Luca De Gioia, Tzachi Hagai, Manuela Sironi. Molecular Biology and Evolution, 41(8), published 2024-08-02. doi:10.1093/molbev/msae154. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11299033/pdf/
We developed phyloBARCODER (https://github.com/jun-inoue/phyloBARCODER), a new web tool that can identify short DNA sequences to the species level using metabarcoding. phyloBARCODER estimates phylogenetic trees based on the uploaded anonymous DNA sequences and reference sequences from databases. Without such phylogenetic contexts, alternative, similarity-based methods independently identify species names and anonymous sequences of the same group by pairwise comparisons between queries and database sequences, with the caveat that they must match exactly or very closely. By putting metabarcoding sequences into a phylogenetic context, phyloBARCODER accurately identifies (i) species or classification of query sequences and (ii) anonymous sequences associated with the same species or even with populations of query sequences, with clear and accurate explanations. Version 1 of phyloBARCODER stores a database comprising all eukaryotic mitochondrial gene sequences. Moreover, by uploading their own databases, phyloBARCODER users can conduct species identification specialized for sequences obtained from a local geographic region or those of nonmitochondrial genes, e.g. ITS or rbcL.
{"title":"phyloBARCODER: A Web Tool for Phylogenetic Classification of Eukaryote Metabarcodes Using Custom Reference Databases.","authors":"Jun Inoue, Chuya Shinzato, Junya Hirai, Sachihiko Itoh, Yuki Minegishi, Shin-Ichi Ito, Susumu Hyodo","doi":"10.1093/molbev/msae111","DOIUrl":"10.1093/molbev/msae111","url":null,"PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":11.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11297486/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141293566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sha Tan, Lan Liu, Jian-Yu Jiao, Meng-Meng Li, Chao-Jian Hu, Ai-Ping Lv, Yan-Ling Qi, Yu-Xian Li, Yang-Zhi Rao, Yan-Ni Qu, Hong-Chen Jiang, Rochelle M Soo, Paul N Evans, Zheng-Shuang Hua, Wen-Jun Li
Cyanobacteriota, the sole prokaryotes capable of oxygenic photosynthesis (OxyP), occupy a unique and pivotal role in Earth's history. While the notion that OxyP may have originated from Cyanobacteriota is widely accepted, its early evolution remains elusive. Here, by using both metagenomics and metatranscriptomics, we explore 36 metagenome-assembled genomes (MAGs) from hot spring ecosystems, belonging to two deep-branching cyanobacterial orders: Thermostichales and Gloeomargaritales. Functional investigation reveals that Thermostichales encode the crucial thylakoid membrane biogenesis protein, vesicle-inducing protein in plastids 1 (Vipp1). Based on the phylogenetic results, we infer that the evolution of the thylakoid membrane predates the divergence of Thermostichales from other cyanobacterial groups and that Thermostichales may be the most ancient lineage known to date to have inherited this feature from their common ancestor. Apart from OxyP, both lineages are potentially capable of sulfide-driven anoxygenic photosynthesis (AnoxyP) by linking sulfide oxidation to the photosynthetic electron transport chain. Unexpectedly, this AnoxyP capacity appears to be an acquired feature, as the key gene sqr was horizontally transferred from later-evolved cyanobacterial lineages. The presence of two D1 protein variants in Thermostichales suggests the functional flexibility of photosystems, ensuring their survival in fluctuating redox environments. Furthermore, all MAGs feature streamlined phycobilisomes with a preference for capturing longer-wavelength light, implying a unique evolutionary trajectory. Collectively, these results reveal the photosynthetic flexibility in these early-diverging cyanobacterial lineages, shedding new light on the early evolution of Cyanobacteriota and their photosynthetic processes.
{"title":"Exploring the Origins and Evolution of Oxygenic and Anoxygenic Photosynthesis in Deeply Branched Cyanobacteriota.","authors":"Sha Tan, Lan Liu, Jian-Yu Jiao, Meng-Meng Li, Chao-Jian Hu, Ai-Ping Lv, Yan-Ling Qi, Yu-Xian Li, Yang-Zhi Rao, Yan-Ni Qu, Hong-Chen Jiang, Rochelle M Soo, Paul N Evans, Zheng-Shuang Hua, Wen-Jun Li","doi":"10.1093/molbev/msae151","DOIUrl":"10.1093/molbev/msae151","url":null,"PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":11.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11304991/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141748581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Here we present CLUES2, a full-likelihood method to infer natural selection from sequence data that is an extension of the method CLUES. We make several substantial improvements to the CLUES method that greatly increase both its applicability and its speed. We add the ability to use ancestral recombination graphs on ancient data as emissions to the underlying hidden Markov model, which enables CLUES2 to use both temporal and linkage information to make estimates of selection coefficients. We also fully implement the ability to estimate distinct selection coefficients in different epochs, which allows for the analysis of changes in selective pressures through time, as well as selection with dominance. In addition, we greatly increase the computational efficiency of CLUES2 over CLUES using several approximations to the forward-backward algorithms and develop a new way to reconstruct historic allele frequencies by integrating over the uncertainty in the estimation of the selection coefficients. We illustrate the accuracy of CLUES2 through extensive simulations and validate the importance sampling framework for integrating over the uncertainty in the inference of gene trees. We also show that CLUES2 is well-calibrated by showing that under the null hypothesis, the distribution of log-likelihood ratios follows a χ² distribution with the appropriate degrees of freedom. We run CLUES2 on a set of recently published ancient human data from Western Eurasia and test for evidence of changing selection coefficients through time. We find significant evidence of changing selective pressures in several genes correlated with the introduction of agriculture to Europe and the ensuing dietary and demographic shifts of that time. In particular, our analysis supports previous hypotheses of strong selection on lactase persistence during periods of ancient famines and attenuated selection in more modern periods.
{"title":"Fast and Accurate Estimation of Selection Coefficients and Allele Histories from Ancient and Modern DNA.","authors":"Andrew H Vaughn, Rasmus Nielsen","doi":"10.1093/molbev/msae156","DOIUrl":"10.1093/molbev/msae156","url":null,"PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":11.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11321360/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141792878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
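The CLUES2 abstract above states that, under the null hypothesis, log-likelihood ratios follow a χ² distribution. As a rough illustration of how such a statistic is typically turned into a p-value, here is a minimal sketch that assumes a single tested selection coefficient (one degree of freedom). The function name and the example log-likelihood values are hypothetical and are not taken from CLUES2 or its output.

```python
import math


def lrt_p_value_df1(loglik_alt: float, loglik_null: float) -> float:
    """P-value of a likelihood-ratio test, assuming one degree of freedom."""
    # Test statistic: twice the log-likelihood difference, floored at zero
    # so that numerical noise cannot produce a negative statistic.
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # For a chi-square variable with 1 degree of freedom,
    # P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods for a model with selection and a neutral model.
p = lrt_p_value_df1(loglik_alt=-1000.0, loglik_null=-1001.92)
print(f"p-value: {p:.4f}")
```

With more than one free parameter, for example distinct selection coefficients in several epochs, the degrees of freedom change and the one-degree-of-freedom shortcut used here no longer applies.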