Magdalena Djordjevic, Lidija Zivkovic, Hong-Yu Ou, Marko Djordjevic
Type II restriction-modification (R–M) systems play a pivotal role in bacterial defense against invading DNA, influencing the spread of pathogenic traits. These systems often involve coordinated expression of a regulatory protein (C) with restriction (R) enzymes, employing complex feedback loops for regulation. Recent studies highlight the crucial balance between R and M enzymes in controlling horizontal gene transfer (HGT). This manuscript introduces a mathematical model reflecting R–M system dynamics, informed by biophysical evidence, to minimize reliance on arbitrary parameters. Our analysis clarifies the observed variations in M-to-R ratios, emphasizing the regulatory role of the C protein. We analytically derived a stability diagram for C-regulated R–M systems, offering a more straightforward method of analysis than traditional numerical approaches. Our findings reveal conditions leading to both monostability and bistability, linking changes in the M-to-R ratio to factors like cell division timing and plasmid replication rates. These variations may tie the adjustment of defense against phage infection, or the acquisition of new genes such as antibiotic resistance determinants, to changing physiological conditions. We also performed stochastic simulations showing that system regulation may significantly increase M-to-R ratio variability, providing an additional mechanism for generating heterogeneity in bacterial populations.
{"title":"Nonlinear regulatory dynamics of bacterial restriction-modification systems modulates horizontal gene transfer susceptibility","authors":"Magdalena Djordjevic, Lidija Zivkovic, Hong-Yu Ou, Marko Djordjevic","doi":"10.1093/nar/gkae1322","DOIUrl":"https://doi.org/10.1093/nar/gkae1322","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-16","publicationTypes":"Journal Article"}
Sagnik Sen, Pierre-Olivier Estève, Karthikeyan Raman, Julie Beaulieu, Hang Gyeong Chin, George R Feehery, Udayakumar S Vishnu, Shuang-yong Xu, James C Samuelson, Sriharsa Pradhan
Gene expression is regulated by chromatin DNA methylation and other features, including histone post-translational modifications (PTMs), chromatin remodelers and transcription factor occupancy. A complete understanding of gene regulation will require the mapping of these chromatin features in samples with small cell numbers. Here we describe a novel genome-wide chromatin profiling technology named Nicking Enzyme Epitope targeted DNA sequencing (NEED-seq). NEED-seq offers antibody-targeted controlled nicking by an Nt.CviPII-pGL fusion to study specific protein–DNA complexes in formaldehyde-fixed cells, allowing for both visual and genomic resolution of epitope-bound chromatin. When applied to nuclei, NEED-seq yielded genome-wide profiles of chromatin-associated proteins and histone PTMs. Additionally, NEED-seq of lamin B1 and B2 demonstrated their association with heterochromatin. Lamin B1- and B2-associated domains (LADs) segregated into three different states, and states with stronger LADs correlated with heterochromatic marks. Hi-C analysis displayed A and B compartments with equal lamin B1 and B2 distribution, although methylated DNA remained high in the B compartment. LAD clustering with Hi-C resulted in subcompartments, with lamin B1 and B2 partitioning to facultative and constitutive heterochromatin, respectively, and these were associated with neuronal development. Thus, lamin B1 and B2 show structural and functional partitioning in the mammalian nucleus.
{"title":"Distinct structural and functional heterochromatin partitioning of lamin B1 and lamin B2 revealed using genome-wide nicking enzyme epitope targeted DNA sequencing","authors":"Sagnik Sen, Pierre-Olivier Estève, Karthikeyan Raman, Julie Beaulieu, Hang Gyeong Chin, George R Feehery, Udayakumar S Vishnu, Shuang-yong Xu, James C Samuelson, Sriharsa Pradhan","doi":"10.1093/nar/gkae1317","DOIUrl":"https://doi.org/10.1093/nar/gkae1317","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-16","publicationTypes":"Journal Article"}
Hongli Lin, Xinyun Ye, Wenyan Chen, Danni Hong, Lifang Liu, Feng Chen, Ning Sun, Keying Ye, Jizhou Hong, Yalin Zhang, Falong Lu, Lei Li, Jialiang Huang
Enhancer clusters, pivotal in mammalian development and diseases, can organize as enhancer networks to control cell identity and disease genes; however, the underlying mechanism remains largely unexplored. Here, we introduce eNet 2.0, a comprehensive tool for enhancer networks analysis during development and diseases based on single-cell chromatin accessibility data. eNet 2.0 extends our previous work eNet 1.0 by adding network topology, comparison and dynamics analyses to its network construction function. We reveal modularly organized enhancer networks, where inter-module interactions synergistically affect gene expression. Moreover, network alterations correlate with abnormal and dynamic gene expression in disease and development. eNet 2.0 is robust across diverse datasets. To facilitate application, we introduce eNetDB (https://enetdb.huanglabxmu.com), an enhancer network database leveraging extensive scATAC-seq (single-cell assay for transposase-accessible chromatin sequencing) datasets from human and mouse tissues. Together, our work provides a powerful computational tool and reveals that modularly organized enhancer networks contribute to gene expression robustness in mammalian development and diseases.
{"title":"Modular organization of enhancer network provides transcriptional robustness in mammalian development","authors":"Hongli Lin, Xinyun Ye, Wenyan Chen, Danni Hong, Lifang Liu, Feng Chen, Ning Sun, Keying Ye, Jizhou Hong, Yalin Zhang, Falong Lu, Lei Li, Jialiang Huang","doi":"10.1093/nar/gkae1323","DOIUrl":"https://doi.org/10.1093/nar/gkae1323","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-16","publicationTypes":"Journal Article"}
Faced with nutritional stress, some bacteria form endospores capable of enduring extreme conditions for long periods of time; yet the function of many proteins expressed during sporulation remains a mystery. We identify one such protein, KapD, as a 3′-exoribonuclease expressed under control of the mother cell-specific transcription factors SigE and SigK in Bacillus subtilis. KapD dynamically assembles over the spore surface through a direct interaction with the major crust protein CotY. KapD catalytic activity is essential for normal adhesiveness of spore surface layers. We identify the sigK mRNA as a key KapD substrate and show that the stability of this transcript is regulated by CotY-mediated sequestration of KapD. SigK is tightly controlled through excision of a prophage-like element, transcriptional regulation and the removal of an inhibitory pro-sequence. Our findings uncover a fourth, post-transcriptional layer of control of sigK expression that couples late-stage gene expression in the mother cell to spore morphogenesis.
{"title":"Embedding a ribonuclease in the spore crust couples gene expression to spore development in Bacillus subtilis","authors":"Alexandre D’Halluin, Laetitia Gilet, Armand Lablaine, Olivier Pellegrini, Mónica Serrano, Anastasia Tolcan, Magali Ventroux, Sylvain Durand, Marion Hamon, Adriano O Henriques, Rut Carballido-López, Ciarán Condon","doi":"10.1093/nar/gkae1301","DOIUrl":"https://doi.org/10.1093/nar/gkae1301","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-16","publicationTypes":"Journal Article"}
Weiqiang Lin, Yisu Li, Chuan Qiu, Binghao Zou, Yun Gong, Xiao Zhang, Di Tian, William Sherman, Fernando Sanchez, Di Wu, Kuan-Jui Su, Xinyi Xiao, Zhe Luo, Qing Tian, Yiping Chen, Hui Shen, Hongwen Deng
Bone is a multifaceted tissue requiring orchestrated interplays of diverse cells within specialized microenvironments. Although significant progress has been made in understanding cellular and molecular mechanisms of component cells of bone, revealing their spatial organization and interactions in native bone tissue microenvironment is crucial for advancing precision medicine, as they govern fundamental signaling pathways and functional dependencies among various bone cells. In this study, we present the first integrative high-resolution map of human bone and bone marrow, using spatial and single-cell transcriptomics profiling from femoral tissue. This multi-modal approach discovered a novel bone formation-specialized niche enriched with osteoblastic lineage cells and fibroblasts and unveiled critical cell–cell communications and co-localization patterns between osteoblastic lineage cells and other cells. Furthermore, we discovered a novel spatial gradient of cellular composition, gene expression and signaling pathway activities radiating from the trabecular bone. This comprehensive atlas delineates the intricate bone cellular architecture and illuminates key molecular processes and dependencies among cells that coordinate bone metabolism. In sum, our study provides an essential reference for the field of bone biology and lays the foundation for advanced mechanistic studies and precision medicine approaches in bone-related disorders.
{"title":"Mapping the spatial atlas of the human bone tissue integrating spatial and single-cell transcriptomics","authors":"Weiqiang Lin, Yisu Li, Chuan Qiu, Binghao Zou, Yun Gong, Xiao Zhang, Di Tian, William Sherman, Fernando Sanchez, Di Wu, Kuan-Jui Su, Xinyi Xiao, Zhe Luo, Qing Tian, Yiping Chen, Hui Shen, Hongwen Deng","doi":"10.1093/nar/gkae1298","DOIUrl":"https://doi.org/10.1093/nar/gkae1298","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-16","publicationTypes":"Journal Article"}
Kenny Jungfer, Štefan Moravčík, Carmela Garcia-Doval, Anna Knörlein, Jonathan Hall, Martin Jinek
Type III clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) systems (type III CRISPR-Cas systems) use guide RNAs to recognize RNA transcripts of foreign genetic elements, which triggers the generation of cyclic oligoadenylate (cOA) second messengers by the Cas10 subunit of the type III effector complex. In turn, cOAs bind and activate ancillary effector proteins to reinforce the host immune response. Type III systems utilize distinct cOAs, including cyclic tri- (cA3), tetra- (cA4) and hexa-adenylates (cA6). However, the molecular mechanisms dictating cOA product identity are poorly understood. Here we used cryoelectron microscopy to visualize the mechanism of cA6 biosynthesis by the Csm effector complex from Enterococcus italicus (EiCsm). We show that EiCsm synthesizes oligoadenylate nucleotides in the 3′–5′ direction using a set of conserved binding sites in the Cas10 Palm domains to determine the size of the nascent oligoadenylate chain. Our data also reveal that conformational dynamics induced by target RNA binding result in allosteric activation of Cas10 to trigger oligoadenylate synthesis. Mutations of a key structural element in Cas10 perturb cOA synthesis to favor cA3 and cA4 formation. Together, these results provide comprehensive insights into the dynamics of cOA synthesis in type III CRISPR-Cas systems and reveal key determinants of second messenger product selectivity, thereby illuminating potential avenues for their engineering.
{"title":"Mechanistic determinants and dynamics of cA6 synthesis in type III CRISPR-Cas effector complexes","authors":"Kenny Jungfer, Štefan Moravčík, Carmela Garcia-Doval, Anna Knörlein, Jonathan Hall, Martin Jinek","doi":"10.1093/nar/gkae1277","DOIUrl":"https://doi.org/10.1093/nar/gkae1277","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-15","publicationTypes":"Journal Article"}
Emanuela Ruggiero, Maja Marušič, Irene Zanin, Cristian David Peña Martinez, Daniel Christ, Janez Plavec, Sara N Richter
i-Motifs (iMs) are quadruplex nucleic acid conformations that form in cytosine-rich regions. Because of their acidic pH dependence, iMs were thought to form only in vitro. The recent development of an iM-selective antibody, iMab, has allowed iM detection in cells, which revealed their presence at gene promoters and their cell cycle dependence. However, recent evidence emerged which appeared to suggest that iMab recognizes C-rich sequences regardless of their iM conformation. To further investigate the selectivity of iMab, we examined the binding of iMab to C-rich sequences, using a combination of pull-down and western blot assays. Here, we observe that the composition of buffers used during binding and washing steps strongly influences the selectivity of antibody binding. In addition, we demonstrate by nuclear magnetic resonance that several of the previously reported C-rich sequences, which were not expected to form iMs, actually form intermolecular iMs which are selectively recognized by iMab. Our results highlight the specificity of the iMab antibody, emphasize the importance of avoiding in vitro artifacts by optimizing DNA concentrations, blocking and washing conditions, and confirm that iMab is selective not only for intramolecular iMs but also for intermolecular iMs, while not affecting the iM conformation.
{"title":"The iMab antibody selectively binds to intramolecular and intermolecular i-motif structures","authors":"Emanuela Ruggiero, Maja Marušič, Irene Zanin, Cristian David Peña Martinez, Daniel Christ, Janez Plavec, Sara N Richter","doi":"10.1093/nar/gkae1305","DOIUrl":"https://doi.org/10.1093/nar/gkae1305","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-15","publicationTypes":"Journal Article"}
Veniamin Fishman, Yuri Kuratov, Aleksei Shmelev, Maxim Petrov, Dmitry Penzar, Denis Shepelin, Nikolay Chekanov, Olga Kardymon, Mikhail Burtsev
Recent advancements in genomics, propelled by artificial intelligence, have unlocked unprecedented capabilities in interpreting genomic sequences, mitigating the need for exhaustive experimental analysis of complex, intertwined molecular processes inherent in DNA function. A significant challenge, however, resides in accurately decoding genomic sequences, which inherently involves comprehending rich contextual information dispersed across thousands of nucleotides. To address this need, we introduce GENA language model (GENA-LM), a suite of transformer-based foundational DNA language models capable of handling input lengths up to 36 000 base pairs. Notably, integrating the newly developed recurrent memory mechanism allows these models to process even larger DNA segments. We provide pre-trained versions of GENA-LM, including multispecies and taxon-specific models, demonstrating their capability for fine-tuning and addressing a spectrum of complex biological tasks with modest computational demands. While language models have already achieved significant breakthroughs in protein biology, GENA-LM showcases a similarly promising potential for reshaping the landscape of genomics and multi-omics data analysis. All models are publicly available on GitHub (https://github.com/AIRI-Institute/GENA_LM) and on HuggingFace (https://huggingface.co/AIRI-Institute). In addition, we provide a web service (https://dnalm.airi.net/) allowing user-friendly DNA annotation with GENA-LM models.
{"title":"GENA-LM: a family of open-source foundational DNA language models for long sequences","authors":"Veniamin Fishman, Yuri Kuratov, Aleksei Shmelev, Maxim Petrov, Dmitry Penzar, Denis Shepelin, Nikolay Chekanov, Olga Kardymon, Mikhail Burtsev","doi":"10.1093/nar/gkae1310","DOIUrl":"https://doi.org/10.1093/nar/gkae1310","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-15","publicationTypes":"Journal Article"}
The CRISPR-derived endoribonuclease Csy4 is a popular tool for controlling transgene expression in various therapeutically relevant settings, but adverse effects potentially arising from non-specific RNA cleavage remain largely unexplored. Here, we report a split-Csy4 architecture that was carefully optimized for in vivo usage. First, we separated Csy4 into two independent protein moieties whose full catalytic activity can be restored via various constitutive or conditional protein dimerization systems. Next, we show that introduction of split-Csy4 into human cells caused substantially less perturbation of the endogenous transcriptome when directly compared to full-length Csy4. Inspired by these results, we went on to use this split-Csy4 module to engineer inducible CRISPR- and translation-level gene switches regulated by the FDA-approved drug grazoprevir. This work provides a valuable resource for Csy4-related biomedical research and discusses important issues for the development of clinically eligible regulation tools.
{"title":"A trigger-inducible split-Csy4 architecture for programmable RNA modulation","authors":"Lihang Zhang, Xinyuan Qiu, Yuting Zhou, Zhengyang Luo, Lingyun Zhu, Jiawei Shao, Mingqi Xie, Hui Wang","doi":"10.1093/nar/gkae1319","DOIUrl":"https://doi.org/10.1093/nar/gkae1319","journal":{"name":"Nucleic Acids Research"},"publicationDate":"2025-01-15","publicationTypes":"Journal Article"}
U6 snRNA (small nuclear ribonucleic acid) is a ribozyme that catalyzes pre-messenger RNA (pre-mRNA) splicing and undergoes epitranscriptomic modifications. After transcription, the 3′-end of U6 snRNA is oligo-uridylylated by the multi-domain terminal uridylyltransferase (TUTase), TUT1. The 3′-oligo-uridylylated tail of U6 snRNA is crucial for U4/U6 di-snRNP (small nuclear ribonucleoprotein) formation and pre-mRNA splicing. Here, we present the cryo-electron microscopy structure of the human TUT1:U6 snRNA complex. The AUA-rich motif between the 5′-short stem-loop and the telestem of U6 snRNA is clamped by the N-terminal zinc finger (ZF)–RNA recognition motif and the catalytic Palm of TUT1, and the telestem is gripped by the N-terminal ZF and the Fingers, positioning the 3′-end of the telestem in the catalytic pocket. The internal stem-loop in the 3′-stem-loop of U6 snRNA is anchored by the C-terminal kinase-associated 1 domain, preventing U6 snRNA from dislodging on the TUT1 surface during oligo-uridylylation. TUT1 recognizes the sequence and structural features of U6 snRNA, and holds the entire U6 snRNA body using multiple domains to ensure oligo-uridylylation. This highlights the specificity of TUT1 as a U6 snRNA-targeting TUTase.
{"title":"Cryo-EM structure of human TUT1:U6 snRNA complex","authors":"Seisuke Yamashita, Kozo Tomita","doi":"10.1093/nar/gkae1314","DOIUrl":"https://doi.org/10.1093/nar/gkae1314","url":null,"abstract":"U6 snRNA (small nuclear ribonucleic acid) is a ribozyme that catalyzes pre-messenger RNA (pre-mRNA) splicing and undergoes epitranscriptomic modifications. After transcription, the 3′-end of U6 snRNA is oligo-uridylylated by the multi-domain terminal uridylyltransferase (TUTase), TUT1. The 3′-oligo-uridylylated tail of U6 snRNA is crucial for U4/U6 di-snRNP (small nuclear ribonucleoprotein) formation and pre-mRNA splicing. Here, we present the cryo-electron microscopy structure of the human TUT1:U6 snRNA complex. The AUA-rich motif between the 5′-short stem-loop and the telestem of U6 snRNA is clamped by the N-terminal zinc finger (ZF)–RNA recognition motif and the catalytic Palm of TUT1, and the telestem is gripped by the N-terminal ZF and the Fingers, positioning the 3′-end of the telestem in the catalytic pocket. The internal stem-loop in the 3′-stem-loop of U6 snRNA is anchored by the C-terminal kinase-associated 1 domain, preventing U6 snRNA from dislodging on the TUT1 surface during oligo-uridylylation. TUT1 recognizes the sequence and structural features of U6 snRNA, and holds the entire U6 snRNA body using multiple domains to ensure oligo-uridylylation. This highlights the specificity of TUT1 as a U6 snRNA-targeting TUTase.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"42 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142986734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}