DNA is most often found in its canonical B-form double-helical structure, but can also adopt alternative conformations, known as non-B DNA structures. Numerous non-B structures have been characterized, including G-quadruplexes, i-motifs, Z-DNA, hairpins, cruciforms, slipped structures, R-loops, and H-DNA. Non-B DNA motifs are enriched in functional regions, including near transcription start and end sites, topologically associated domains, and replication origins, suggesting their importance in gene regulation, genome organization, and replication. However, these structures are intrinsically prone to error-generating processing, which leads to genomic instability, and they have therefore been implicated in the development of human diseases. Here, we discuss recent advances in understanding the biological roles of non-B DNA structures and their contribution to genomic instability in somatic and germline contexts. We highlight how they promote replication stress, transcription stalling, and DNA breaks, resulting in the formation of mutational hotspots. Emerging technologies have enabled the detailed mapping of previously challenging repetitive regions that harbor potential non-B DNA-forming sequences, and are poised to reveal additional contributions of these structures to human disease and evolution. Furthermore, we explore the dual role of non-B DNA as a driver of genetic variation that facilitates evolutionary adaptation and as a source of mutations that contribute to tissue dysfunction and aging.
{"title":"Non-B DNA structures and their contributions to genetic diversity, aging, and disease.","authors":"Eleftherios Bochalis, Irene Dereki, Guliang Wang, Argyro Sgourou, Karen M Vasquez, Ilias Georgakopoulos-Soares","doi":"10.1093/nar/gkag084","DOIUrl":"10.1093/nar/gkag084","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12887540/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146150390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
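As an illustration of what a "potential non-B DNA-forming sequence" can look like in practice, the sketch below scans a sequence for putative G-quadruplex motifs. It assumes a widely used heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides); this pattern is not taken from the review itself, and the function name and toy sequence are hypothetical. It searches only the strand it is given, so a C-rich pattern would be needed to cover the complementary strand.

```python
import re

# Assumed heuristic (not from the review): four G-runs of >= 3 guanines,
# separated by three loops of 1-7 nucleotides each.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_putative_g4(seq):
    """Return (start, end, motif) for each non-overlapping match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Hypothetical toy sequence: a telomere-like G-rich repeat flanked by AT filler.
toy = "ATATAT" + "GGGTTAGGGTTAGGGTTAGGG" + "ATATAT"
hits = find_putative_g4(toy)
print(hits)
```

A match here only flags sequence potential; whether a structure actually forms in cells depends on context that a pattern match cannot capture.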
CRISPR-Cas9 knock-in efficiency is often limited by geometric misalignment between donor DNA and the endogenous strand-invasion path. In Aspergillus nidulans, we found that integration drops sharply when the insertion site is offset from the invasion entry point, producing premature annealing or unsupported 3' ends that stall DNA synthesis. Chromatin immunoprecipitation-based profiling shows directional loading of the RAD51 homolog UvsC around Cas9-induced double-strand breaks, thereby defining the spatial origin of strand invasion. Guided by this insight, we introduce a dual-single-guide RNA design that places two cuts flanking the insertion site to create a geometry-matched strand-invasion window. This alignment consistently and markedly increases homology-directed-repair-mediated integration across insert sizes and editing tasks (including C-terminal tagging, bidirectional promoter rewiring, and long-distance dual-site mutagenesis) and generalizes across multiple fungal species. We propose a structural-docking model in which pairing fidelity between the resected chromosomal strand and donor homology arms governs knock-in outcomes, providing a practical design principle for efficient and precise genome engineering at structurally constrained loci.
{"title":"Dual-single-guide RNA strategy improves CRISPR-mediated homology-directed repair in Aspergillus.","authors":"Mingxin Fu, Jing Wang, Jingyi Li, Yao Zhou, Xiaofei Huang, Zehan Jia, Yiqing Luo, Xinyu Tan, Yan Gao, Bingzi Yu, Yuting Duan, Qianyun Bu, Xiaoying Li, Yifan Wang, Naoki Takaya, Shengmin Zhou","doi":"10.1093/nar/gkag095","DOIUrl":"10.1093/nar/gkag095","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873602/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146125883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anja Trupej, Valter Bergant, Jona Novljan, Martin Dodel, Tajda Klobučar, Maksimiljan Adamek, Flora C Y Lee, Karen Yap, Eugene Makeyev, Boštjan Kokot, Luka Čehovin Zajc, Andreas Pichlmair, Iztok Urbančič, Faraz K Mardakheh, Miha Modic
The spatial organization of RNA-scaffolded condensates is fundamental to understanding basic cellular functions and may also provide pivotal insights into disease. One of the major challenges in understanding the role of condensates is the lack of technologies to map condensate-scale protein architecture at subcompartmental resolution. To address this, we introduce HCR-Proxy, a proximity labelling technique that couples hybridization chain reaction (HCR)-based signal amplification with in situ proximity biotinylation (Proxy), enabling proteomic profiling of RNA-proximal proteomes at subcompartmental resolution. We applied HCR-Proxy to nascent pre-rRNA targets to investigate the distinct proteomic signatures of the nucleolar subcompartments and to uncover a spatial logic of protein partitioning shaped by RNA sequence. Our results demonstrate the ability of HCR-Proxy to provide spatially resolved maps of RNA interactomes within the nucleolus, offering new insights into the molecular organization and compartmentalization of condensates. This subcompartment-specific nucleolar proteome profiling enabled integration with deep learning frameworks, which effectively confirmed a sequence-encoded basis for protein partitioning across nested condensate subcompartments, characterized by antagonistic gradients in charge, molecular weight, and RNA-binding domains. HCR-Proxy thus provides a scalable platform for spatially resolved RNA interactome discovery, bridging transcript localization with proteomic context in native cellular environments.
{"title":"HCR-Proxy resolves site-specific proximal RNA microenvironments at subcompartmental resolution.","authors":"Anja Trupej, Valter Bergant, Jona Novljan, Martin Dodel, Tajda Klobučar, Maksimiljan Adamek, Flora C Y Lee, Karen Yap, Eugene Makeyev, Boštjan Kokot, Luka Čehovin Zajc, Andreas Pichlmair, Iztok Urbančič, Faraz K Mardakheh, Miha Modic","doi":"10.1093/nar/gkag086","DOIUrl":"10.1093/nar/gkag086","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12926915/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147271576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to 'Splicing regulation and intron evolution in the short-intron ciliate model of endosymbiosis Paramecium bursaria'.","authors":"","doi":"10.1093/nar/gkag163","DOIUrl":"10.1093/nar/gkag163","url":null,"abstract":"","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12926910/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147271612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Helen E King, Savannah O'Connell, Daisy Kavanagh, Sofia Mason, Cerys McCool, Javier Fernandez-Chamorro, Christine L Chaffer, Susan J Clark, Helaine Graziele S Vieira, Timothy Sterne-Weiler, Robert J Weatheritt
CRISPR interference (CRISPRi) screens have emerged as powerful tools for dissecting gene function, yet their application to genes with multiple promoters, which comprise over 60% of human genes, remains poorly understood. Here, we demonstrate that CRISPR-dCas9-based screens exhibit widespread promoter specificity, with untargeted promoters often showing compensatory upregulation to maintain gene expression. Leveraging this selective targeting of individual promoters within the same gene, we developed Isoform-Specific single-cell Perturb-Seq to systematically analyse alternative promoter function. Our analysis revealed that alternative promoters in 51.6% of targeted genes drive distinct transcriptional programs. This suggests that promoter selection represents a fundamental mechanism for generating cellular diversity rather than mere transcriptional redundancy. In breast cancer models, this promoter-specific targeting revealed differential effects on drug sensitivity, where distinct estrogen receptor (ESR1) promoters showed opposing influences on tamoxifen response and patient survival. These findings demonstrate the necessity of promoter-level analysis in functional genomics and suggest new strategies for therapeutic intervention through promoter-specific targeting.
{"title":"Isoform-specific single-cell perturb-seq reveals distinct functions of alternative promoters in drug response.","authors":"Helen E King, Savannah O'Connell, Daisy Kavanagh, Sofia Mason, Cerys McCool, Javier Fernandez-Chamorro, Christine L Chaffer, Susan J Clark, Helaine Graziele S Vieira, Timothy Sterne-Weiler, Robert J Weatheritt","doi":"10.1093/nar/gkag118","DOIUrl":"10.1093/nar/gkag118","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12926921/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147271652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Z-DNA is a left-handed alternative form of DNA with important biological roles in cancer and other genetic diseases. In a recent study, we found that CBL0137, a curaxin ligand, enhances cancer immunotherapy by inducing Z-DNA formation and activating the Z-DNA-binding protein ZBP1. However, no structural information on complexes between Z-DNA and the CBL0137 ligand has been reported to date. Here we present the first high-resolution structure of the complex between Z-DNA and the curaxin ligand CBL0137. The compound is observed to interact with Z-DNA through π-stacking and zig-zag localization. Furthermore, we directly observe the complex in living human cells using in-cell 19F NMR for the first time. This structural information provides a platform for the design of topology-specific Z-DNA-targeting compounds and is valuable for the development of new potent anticancer drugs.
{"title":"Solution structure of Z-form DNA bound to a curaxin ligand CBL0137.","authors":"Feifan Liu, Shiyu Wang, Yan Xu","doi":"10.1093/nar/gkag104","DOIUrl":"10.1093/nar/gkag104","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12887531/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146149750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nucleotide salvage is crucial for maintaining DNA replication when de novo nucleotide synthesis is limited, but this metabolic flexibility poses potential threats to genome stability. Salvage kinases phosphorylate nucleosides broadly, allowing for oxidized and alkylated 2'-deoxynucleosides as well as posttranscriptionally modified ribonucleosides to enter the 2'-deoxynucleoside triphosphate (dNTP) pool. The ensuing contamination of the dNTP pool and the subsequent incorporation of modified nucleotides into genomic DNA promote mutagenesis, induce replication stress, elicit double-strand breaks, and disrupt epigenetic signaling. Although only a small subset of modified nucleosides have been assessed for salvage and genomic incorporation, the scope of salvageable substrates is probably much wider, with significant implications in mutational burden, chromatin instability, and epigenetic regulation. This overlooked aspect of genome instability is especially relevant in biological contexts of high salvage activity or elevated nucleoside damage, including chronic inflammation, cancer, aging, and dietary/microbiome exposures. Emerging evidence links salvage metabolism to tumor progression, where incorporation of salvage-derived nucleotides may contribute to unexplainable mutational signatures detected in cancers, such as gastrointestinal cancer. Recognizing salvage as a hidden source of mutagenesis reshapes our understanding of genome instability and provides potential opportunities for disease prevention, diagnosis, and therapeutic intervention.
{"title":"Nucleotide salvage, genome instability, and potential therapeutic applications.","authors":"Pengcheng Wang, Chen Wang, Yinsheng Wang","doi":"10.1093/nar/gkag099","DOIUrl":"10.1093/nar/gkag099","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12887539/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146150380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ofelia Karlsson, Ninoslav Pandiloski, Vivien Horvath, Anita Adami, Raquel Garza, Pia A Johansson, Jenny G Johansson, Christopher H Douse, Johan Jakobsson
Heterochromatin is characterized by inaccessibility to the transcriptional machinery and is associated with the histone mark H3K9me3. However, studying the functional consequences of heterochromatin loss in human cells has been challenging. Here, we used CRISPRi-mediated silencing of the histone methyltransferase SETDB1 to remove H3K9me3 heterochromatin in human neural progenitor cells. Despite a major loss of H3K9me3 peaks resulting in genome-wide reorganization of heterochromatin domains, silencing of SETDB1 had a limited effect on cell viability. Cells remained proliferative and expressed appropriate marker genes. We found that a key event following the loss of SETDB1-mediated H3K9me3 was the expression of evolutionarily young L1 retrotransposons. Derepression of L1s was associated with a loss of CpG DNA methylation at their promoters, suggesting that deposition of H3K9me3 at the L1 promoter is required to maintain DNA methylation. In conclusion, these results demonstrate that loss of H3K9me3 in human neural somatic cells transcriptionally activates evolutionarily young L1 retrotransposons.
{"title":"Loss of SETDB1-mediated H3K9me3 in human neural progenitor cells leads to transcriptional activation of L1 retrotransposons.","authors":"Ofelia Karlsson, Ninoslav Pandiloski, Vivien Horvath, Anita Adami, Raquel Garza, Pia A Johansson, Jenny G Johansson, Christopher H Douse, Johan Jakobsson","doi":"10.1093/nar/gkag100","DOIUrl":"10.1093/nar/gkag100","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873604/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146125903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yan Xia, Jinyuan Sun, Xiaowen Du, Zeyu Liang, Xin Wu, Wenyu Shi, Bin Shao, Shuyuan Guo, Yi-Xin Huo
Deep learning has successfully been applied to design cis-regulatory elements (CREs) for a few species, but a broadly applicable platform for generating functional promoters for thousands of prokaryotes remains lacking. In this study, we introduce a language model for prokaryotic CREs, referred to as PromoGen2, to design CREs without prior experimental data. PromoGen2 was pretrained on CREs derived from 17 000 prokaryotic genomes. It achieved the highest zero-shot prediction correlation of promoter strength across species, improving the average Spearman correlation from 0.27 to 0.50 compared to the best baseline, while reducing the number of parameters by 103. Artificial CREs designed with PromoGen2 demonstrated a 100% success rate in Escherichia coli, Bacillus subtilis, Bacillus licheniformis, and Agrobacterium tumefaciens. Based on PromoGen2, we developed the Promoter-Factory framework to design promoters from unannotated genomes. Experimental validation showed that most of the promoters designed for Jejubacter sp. L23, a newly isolated halophilic bacterium with no available CREs, were active and capable of driving lycopene overproduction. Additionally, we introduced PromoGen2-proka, a taxonomy-aware model for CRE design based on prokaryotic genera. Experimental validation confirmed its reliable success rate. The combined use of PromoGen2-proka and Promoter-Factory offers a broadly applicable tool for designing CREs for prokaryotes, fulfilling the needs of synthetic biology and microbiology research.
{"title":"Design prokaryotic cis-regulatory elements using language model.","authors":"Yan Xia, Jinyuan Sun, Xiaowen Du, Zeyu Liang, Xin Wu, Wenyu Shi, Bin Shao, Shuyuan Guo, Yi-Xin Huo","doi":"10.1093/nar/gkag122","DOIUrl":"10.1093/nar/gkag122","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12907563/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146202390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
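The PromoGen2 abstract reports zero-shot performance as a Spearman correlation between predicted and measured promoter strength. As a rough sketch of how such a rank correlation can be computed, the example below ranks both vectors (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and all numbers are hypothetical illustrations, not data or code from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical values for illustration only (not from the study).
predicted_scores = [0.1, 0.4, 0.35, 0.8, 0.7]
measured_strength = [12.0, 30.0, 45.0, 90.0, 60.0]
rho = spearman(predicted_scores, measured_strength)
print(round(rho, 3))
```

Because only ranks enter the calculation, the metric rewards a model for ordering promoters correctly even when its raw scores are on a different scale from the measured strengths.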
Zhe Zhang, Haicheng Li, Aili Ju, Fei Ye, Fan Wei, Yongqiang Liu, Junhua Niu, Hongzhen Jiang, Yuanyuan Wang, Shan Gao
Eukaryotic gene expression is dynamically regulated through the interplay between histone modifications and chromatin remodeling, yet how these processes are coordinated remains incompletely understood. Here, we uncover IBD1 as a critical adaptor that bridges histone acetylation and SWR-mediated H2A.Z deposition. Mechanistically, IBD1's bromodomain recognizes histone acetylation, specifically H3K9/K14 di-acetylation, to recruit the SWR complex subunit ARP6, ensuring precise H2A.Z incorporation into chromatin. H3K9Q mutation and genetic disruption of IBD1, either by deletion or bromodomain mutation, significantly reduce H2A.Z occupancy at target loci. In contrast, disruption of IBD1 has little effect on H3K9/K14 acetylation levels, confirming the directional hierarchy of the acetylation-IBD1-H2A.Z regulatory axis. Intriguingly, perturbation of this axis, through IBD1 loss or bromodomain impairment, leads to widespread transcriptional upregulation, particularly at genes co-enriched for IBD1, H3K9/K14ac, and H2A.Z, with the strongest effects at hyperacetylated loci. This transcriptional imbalance coincides with reduced growth rates, underscoring the functional significance of IBD1-mediated H2A.Z deposition. Given that H2A.Z enrichment is classically correlated with transcriptional levels, this observation highlights a dual role for H2A.Z: sustaining basal transcription and constraining overactivation at highly active genes. Together, our findings define a novel regulatory mechanism in which IBD1 bridges acetyl-mark decoding with SWR-dependent H2A.Z deposition, establishing transcriptional homeostasis.
{"title":"Bromodomain protein IBD1 bridges histone acetylation and H2A.Z deposition to fine-tune transcription.","authors":"Zhe Zhang, Haicheng Li, Aili Ju, Fei Ye, Fan Wei, Yongqiang Liu, Junhua Niu, Hongzhen Jiang, Yuanyuan Wang, Shan Gao","doi":"10.1093/nar/gkag148","DOIUrl":"10.1093/nar/gkag148","url":null,"PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"54 4","pages":""},"PeriodicalIF":13.1,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12926916/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147271603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}