Pub Date : 2026-01-27 DOI: 10.1186/s13059-026-03950-1
Sara Bakić, Krešimir Friganović, Bryan Hooi, Mile Šikić
Nanopore sequencing enables real-time, long-read analysis by processing raw signals as they are produced. A key step, segmentation of signals into events, is typically handled by algorithmic methods that struggle in noisy regions. We present Campolina, a first deep-learning framework for accurate segmentation of raw nanopore signals. Campolina uses a convolutional model to identify event boundaries and significantly outperforms the traditional Scrappie algorithm on R9.4.1 and R10.4.1 datasets. We introduce a comprehensive evaluation pipeline and show that Campolina aligns better with reference-guided ground-truth segmentation. We also show that integrating Campolina segmentation into the real-time frameworks Sigmoni and RawHash2 improves their performance while maintaining time efficiency.
Title: Campolina: a deep neural framework for accurate segmentation of nanopore signals.
Journal: Genome Biology
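To make the segmentation step in the Campolina abstract concrete, the sketch below shows a generic windowed t-statistic change detector of the kind classical algorithmic segmenters rely on. It is an illustration only: it is neither Scrappie's implementation nor Campolina's convolutional model, and the function name `tstat_boundaries` and the `window` and `threshold` defaults are assumptions chosen for the example.

```python
import numpy as np

def tstat_boundaries(signal, window=5, threshold=4.0):
    """Return indices where a windowed t-statistic peaks above a threshold.

    Hypothetical helper for illustration; not taken from Scrappie or Campolina.
    """
    x = np.asarray(signal, dtype=float)
    n = len(x)
    scores = np.zeros(n)
    # Compare the mean of the `window` samples before i with the `window` samples from i on.
    for i in range(window, n - window + 1):
        left = x[i - window:i]
        right = x[i:i + window]
        # Standard error of the difference in means; the small constant avoids division by zero.
        pooled = np.sqrt((left.var() + right.var()) / window + 1e-12)
        scores[i] = abs(left.mean() - right.mean()) / pooled
    # Keep only local maxima of the score that exceed the threshold.
    boundaries = []
    for i in range(1, n - 1):
        if scores[i] > threshold and scores[i] >= scores[i - 1] and scores[i] > scores[i + 1]:
            boundaries.append(i)
    return boundaries

# Three noiseless current levels of 20 samples each.
toy_signal = np.concatenate([np.zeros(20), np.ones(20), np.full(20, -0.5)])
print(tstat_boundaries(toy_signal))
```

On real, noisy signals a fixed threshold of this kind tends to either miss short events or over-segment, which is the weakness the abstract attributes to algorithmic approaches.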
Pub Date : 2026-01-27 DOI: 10.1186/s13059-026-03953-y
Jiahui Zhu, Yuying Li, Cao Yu, Weixi Huang, Junming Chen, Xiaoshuang Liu, Ruiying Qin, Juan Li, Rongfang Xu, Pengcheng Wei
Background: IscB (Insertion sequences Cas9-like OrfB) represents a novel class of RNA-guided nucleases, approximately one-third the size of Cas9 proteins. Despite the limited natural efficiency in eukaryotic cells, recent advances have led to the engineering of several IscBs for mammalian genome editing.
Results: In this study, we screen and identify high-activity IscB variants for rice. The pIscB-v3 version, combining enOgeuIscB and ωRNA-v13, demonstrates superior mutagenesis efficiency compared to other systems. The average editing efficiency of pIscB-v3 is 17.61% across ten endogenous targets, and we obtain edited lines in up to 83.33% of the T0 generation, with homozygous and bi-allelic mutations accounting for 33.33%. Further analysis reveals that pIscB-v3 exhibits high editing specificity and relaxed target-adjacent motif (TAM) compatibility in rice. Beyond gene knockout systems, we develop cytosine base editors (CBEs) and adenine base editors (ABEs) from pIscB-v3. We find that the ssDNA-targeting SCP1.201 family deaminase Sdd7 outperforms human APOBEC3A in IscB-CBEs for C-to-T conversions in rice. Sdd7-nIscB achieves precise edits in 22.92% of lines on average, with a maximum frequency of 47.92%. In contrast, TadA8e-nIscB exhibits limited activity; however, fusing an extra copy of TadA8e to either terminus of TadA8e-nIscB significantly enhances A-to-G conversions.
Conclusions: Collectively, our results demonstrate that IscB is a robust basis for developing an efficient and versatile miniature plant genome editing toolkit that can substantially facilitate crop breeding.
Title: Engineering hypercompact IscB nucleases for efficient and versatile genome editing in rice.
Journal: Genome Biology
Pub Date : 2026-01-27 DOI: 10.1186/s13059-025-03903-0
Ling-Dong Shi, Petar I Penev, Amos J Nissley, Dipti D Nayak, Rohan Sachdeva, Jamie H D Cate, Jillian F Banfield
Background: Processing of archaeal 16S and 23S rRNAs is believed to involve excision of individual rRNAs from polycistronic precursors, circularization of the excised rRNAs, and re-linearization before incorporation into ribosomes. However, this knowledge derives from only a few isolated species, leaving open the possibility that different processes occur in other archaeal groups.
Results: Here, we investigate rRNAs from diverse and mostly uncultivated archaea. Sequencing of total cellular RNA from eight phylum-level lineages indicates that archaeal circular 23S rRNA transcript abundances vastly exceed those of linear counterparts, and linear versions are often undetectable. As the majority of rRNAs derive from mature ribosomes, the data suggest that ribosomes contain circular 23S rRNAs. Thus, we directly sequence RNA extracted from isolated ribosomes of a model archaeon, Methanosarcina acetivorans, and confirm that the 23S rRNAs in the ribosomes are circular. Structural modeling places the 5' and 3' ends of the linear precursors of archaeal 23S rRNAs in close proximity to form a GNRA tetraloop (in which N is A, C, G, or U and R is A or G), consistent with their existence as circular molecules. We also confirm the existence of circular 16S rRNA intermediates in transcriptomes of most archaea, yet a circular form is not evident in some distinct archaeal groups, suggesting that certain archaea do not circularize 16S rRNA during processing.
Conclusions: Our findings uncover unexpected variations in the processing required to generate mature rRNAs and the conformation of functional molecules in archaeal ribosomes.
Title: Circularization of 23S rRNA but not 16S rRNA within archaeal ribosomes.
Journal: Genome Biology
Background: Viticulture has long suffered from downy mildew caused by Plasmopara viticola, a strictly obligate biotrophic oomycete. Numerous studies have examined how grapevine defends against Plasmopara viticola, but they mainly investigate plant defense responses at the whole-tissue level rather than at the cellular level.
Results: Here we employ single-cell RNA sequencing and spatial RNA sequencing to profile approximately 100,000 individual cells (~89,000 from scRNA-seq and ~11,000 from spRNA-seq), generating the first single-cell transcriptome atlas of grapevine leaves during Plasmopara viticola infection. This high-resolution atlas reveals the dynamic and distinct defense responses of plant cells at early stages of oomycete infection. Notably, we find that Plasmopara viticola reprograms the guard cell transcriptome to facilitate successful invasion, likely by altering the expression of ABA negative regulators and modulating a potassium channel regulatory pathway to influence stomatal movement.
Conclusions: Overall, our work reveals differential and dynamic responses of grapevine to Plasmopara viticola infection at the single-cell level, providing valuable clues for dissecting the interaction between plants and oomycetes.
Title: Unveiling the early defense response dynamics in grapevines against Plasmopara viticola by single-cell transcriptomics.
Authors: Xiukun Yao, Zhizhuo Xu, Yasheng Xi, Xinyue He, Qifei Gao, Jiang Lu, Peining Fu
Pub Date : 2026-01-27 DOI: 10.1186/s13059-025-03904-z
Journal: Genome Biology
Pub Date : 2026-01-27 DOI: 10.1186/s13059-026-03934-1
Jerome Freudenberg, Jingyou Rao, Matthew K Howard, Christian Macdonald, Noah F Greenwald, Willow Coyote-Maestas, Harold Pimentel
Deep mutational scanning (DMS) coupled with fluorescence-activated cell sorting (FACS) provides a high-throughput method to link genetic variants with quantitative molecular phenotypes. Analysis of these experiments is challenging due to measurement variance and the multidimensional FACS readout. However, no statistical method has yet been developed to address these challenges. Here we present Lilace, a Bayesian statistical model to estimate variant effects with uncertainty quantification from FACS-based DMS experiments. We validate Lilace's performance and robustness using simulated data and apply it to OCT1 and Kir2.1 DMS datasets, demonstrating an improved false discovery rate while largely maintaining sensitivity.
Title: Accurate variant effect estimation in FACS-based deep mutational scanning data with Lilace.
Journal: Genome Biology
Pub Date : 2026-01-15 DOI: 10.1186/s13059-025-03921-y
Wonji Kim, Xiaowei Hu, Kangjin Kim, Sung Chun, Peter Orchard, Dandi Qiao, Ingo Ruczinski, Aabida Saferali, Francois Aguet, Lucinda Antonacci-Fulton, Pallavi P Balte, Traci M Bartz, Wardatul Jannat Anamika, Xiaobo Zhou, JunYi Duan, Jennifer A Brody, Brian E Cade, Martha L Daviglus, Harshavadran Doddapaneni, Shannon Dugan-Perez, Susan K Dutcher, Christian D Frazar, Stacey B Gabriel, Sina A Gharib, Namrata Gupta, Brian D Hobbs, Silva Kasela, Laura R Loehr, Ginger A Metcalf, Donna M Muzny, Elizabeth C Oelsner, Laura J Rasmussen-Torvik, Colleen M Sitlani, Joshua Smith, Tamar Sofer, Hanfei Xu, Bing Yu, David Zhang, John Ziniti, R Graham Barr, April P Carson, Myriam Fornage, Lifang Hou, Ravi Kalhan, Robert Kaplan, Tuuli Lappalainen, Stephanie J London, Alanna C Morrison, George T O'Connor, Bruce M Psaty, Laura M Raffield, Susan Redline, Stephen S Rich, Jerome I Rotter, Edwin K Silverman, Ani Manichaikul, Michael H Cho
Background: Whole genome sequence (WGS) data in multi-ancestry samples supports discovery of low-frequency or population-specific genetic variants associated with chronic obstructive pulmonary disease (COPD) and lung function.
Results: We performed single variant, structural variant, and gene-based analysis of pulmonary function (FEV1, FVC and FEV1/FVC) and COPD case-control status in 44,287 multi-ancestry participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. We validated findings using the UK Biobank and assessed implicated genes using lung single-cell RNA-seq (scRNA-seq) data sets. Applying a genome-wide significance threshold (P < 5 × 10⁻⁹), we replicated known loci and identified novel associations near LY86, MAGI1, GRK7, and LINC02668. Colocalization with gene expression quantitative trait loci (eQTL) from the Lung Tissue Research Consortium highlighted known candidate genes including ADAM19, THSD4, C4B, and PSMA4, which were not identified through other eQTL sources. Multi-ancestry analysis improved fine-mapping resolution (e.g., HTR4 and RIN3). Gene-based analysis identified and replicated HMCN1. In human lung scRNA-seq data sets, lung epithelial cells and immune cell types showed enriched expression, while fibroblasts showed higher expression for HMCN1. CRISPR targeting HMCN1 in IMR90 demonstrated reduced expression of collagen genes.
Conclusions: Large-scale multi-ancestry WGS analysis improves variant discovery and fine-mapping resolution for lung function and COPD and highlights biologically relevant genes and pathways.
Title: Whole genome sequence analysis of pulmonary function and COPD in 44,287 multi-ancestry participants.
Journal: Genome Biology
Pages: 28
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12888468/pdf/
Pub Date : 2026-01-15 DOI: 10.1186/s13059-025-03874-2
Bingxian Xu, Rosemary Braun
Title: VIST: variational inference for single cell time series.
Journal: Genome Biology
Pages: 29
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12892444/pdf/
Pub Date : 2026-01-14 DOI: 10.1186/s13059-025-03910-1
Denis Kleverov, Ekaterina Aladyeva, Alexey Serdyukov, Maxim N Artyomov
Background: Non-negative matrix factorization is a powerful linear algebra tool used in multiple areas of data analysis, including computational biology. Despite numerous optimization methods devised for non-negative matrix factorization, our understanding of the inherent topological structure within factorizable matrices remains limited.
Results: This study reveals the topological properties of linear mixture data, leading to a remarkable reduction of the non-negative matrix factorization optimization problem to a search for K(K-1) variables, where K represents the number of pure components, regardless of the initial matrix size. This is achieved by revealing complementary simplex structures existing in both feature and sample spaces and leveraging the Sinkhorn transformation to find the relationship between these simplexes. We validate this approach in the context of an unconstrained mixed images scenario and achieve a significant improvement in decomposition accuracy. Furthermore, we successfully applied the proposed approach in the biological context of bulk RNA-seq gene expression deconvolution.
Conclusions: The Dual Simplex unified analytical framework improves robustness to noise and enhances optimization stability, enabling accurate recovery of component proportions and expression profiles. Importantly, the framework naturally accommodates both reference-free and marker-based deconvolution settings, providing a general and efficient solution for analyzing complex biological mixtures such as bulk RNA-seq and single-cell derived data.
Title: Non-negative matrix factorization and deconvolution as a dual simplex problem.
Journal: Genome Biology
Pages: 25
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12888666/pdf/
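The Sinkhorn transformation named in the dual simplex abstract can be illustrated with the standard Sinkhorn-Knopp iteration, which alternately rescales rows and columns of a positive matrix until both sets of sums reach fixed targets. The sketch below is a generic version under assumed conventions (row sums of 1, equal column sums); the abstract does not specify the exact normalization the authors use, and the function name `sinkhorn_scale` is hypothetical.

```python
import numpy as np

def sinkhorn_scale(matrix, n_iter=1000, tol=1e-9):
    """Alternately rescale rows and columns of a strictly positive matrix.

    On convergence every row sums to 1 and every column sums to n_rows / n_cols.
    Generic Sinkhorn-Knopp sketch; not the authors' implementation.
    """
    a = np.asarray(matrix, dtype=float)
    if np.any(a <= 0):
        raise ValueError("this sketch assumes a strictly positive matrix")
    n_rows, n_cols = a.shape
    col_target = n_rows / n_cols  # keeps the total mass consistent with unit row sums
    for _ in range(n_iter):
        a = a / a.sum(axis=1, keepdims=True)                 # rows sum to 1
        a = a * (col_target / a.sum(axis=0, keepdims=True))  # columns sum to col_target
        # Column sums are exact after the step above, so only the rows need checking.
        if np.max(np.abs(a.sum(axis=1) - 1.0)) < tol:
            break
    return a

rng = np.random.default_rng(0)
mixed = rng.uniform(0.1, 1.0, size=(6, 4))  # toy features-by-samples matrix
scaled = sinkhorn_scale(mixed)
print(scaled.sum(axis=1), scaled.sum(axis=0))
```

Fixing both row and column sums places the rescaled rows and the rescaled columns each on a common affine hyperplane, which is one way to see how simplex structures can appear in feature space and sample space at once.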
Background: The NONO protein plays a crucial role in RNA metabolism and DNA repair. It undergoes various post-translational modifications, including phosphorylation, ubiquitination, acetylation and methylation, all of which regulate its diverse cellular functions. However, the role of O-GlcNAcylation in regulating NONO's function in DNA damage repair is not well understood.
Results: This study demonstrates that O-GlcNAcylation of NONO at Serine 147 (Ser147) is essential for its recruitment to DNA damage sites. Specifically, O-GlcNAcylation at Ser147 reduces NONO ubiquitination and stabilizes its interaction with SFPQ, regulating the alternative splicing of the histone methyltransferase SETMAR. A deficiency in O-GlcNAcylation at Ser147 impairs NONO's binding to SETMAR pre-mRNA, leading to increased production of the truncated isoform of SETMAR (SETMAR-S). The resulting SETMAR-S suppresses the generation of H3K36me2 and inhibits the recruitment of Ku70 to DNA damage sites, ultimately impairing non-homologous end joining (NHEJ) repair. Furthermore, disruption of O-GlcNAcylation at Ser147 sensitizes liver cancer cells to ionizing radiation treatment, both in vitro and in vivo.
Conclusions: O-GlcNAcylation at Ser147 of NONO mediates the alternative splicing of SETMAR and facilitates NHEJ repair. Collectively, our findings suggest that targeting NONO O-GlcNAcylation may provide a novel therapeutic strategy for cancer treatment.
Title: O-GlcNAcylation of NONO mediates alternative splicing of SETMAR and facilitates NHEJ repair.
Authors: Mengyuan Li, Huanna Tian, Ziyi Zhou, Yuhan Jiang, Xiaomeng Guo, Weijie Qin, Hongbing Zhang, Yajie Jiao, Shuai Guo, Chen Wu
Pub Date : 2026-01-14 DOI: 10.1186/s13059-026-03930-5
Journal: Genome Biology
Pages: 26
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12888201/pdf/
Pub Date : 2026-01-14 DOI: 10.1186/s13059-026-03931-4
Yahui Xue, Lei Zhou, Yue Zhuo, Weining Li, Sijia Ma, Heng Du, Wanying Li, Jicai Jiang, Jian-Feng Liu
The increasing availability of multi-omics data promises to enhance genomic prediction in breeding and human genetics. However, integrating multi-omics data into genomic prediction models remains challenging due to complex relationships between omics layers and phenotypic outcomes. We propose Fusion Similarity Best Linear Unbiased Prediction (FSBLUP), a novel strategy that integrates genomic and intermediate omics data using a unified similarity matrix approach. FSBLUP systematically estimates how different omics layers contribute to phenotypic variation via machine-learning-optimized parameters that capture the underlying genetic architecture of complex traits. FSBLUP demonstrates greater predictive accuracy than existing methods, as validated through theoretical and practical evaluations.
Title: FSBLUP: a novel strategy of fusion similarity matrix construction via optimally integrating intermediate omics data to enhance genomic prediction.
Journal: Genome Biology
Pages: 27
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12888590/pdf/
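As a rough illustration of the unified similarity matrix idea in the FSBLUP abstract, the sketch below fuses a genomic similarity matrix with an omics similarity matrix through a single mixing weight and then makes a BLUP-style (kernel ridge) prediction. Everything specific here is an assumption made for the example: the convex combination, the `weight` and `ridge` parameters, and the function names are hypothetical, and the abstract does not say how FSBLUP actually constructs or optimizes its fused matrix.

```python
import numpy as np

def linear_kernel(features):
    """Similarity between individuals from column-centered features (rows = individuals)."""
    z = np.asarray(features, dtype=float)
    z = z - z.mean(axis=0)
    return z @ z.T / z.shape[1]

def fused_blup_predict(k_genomic, k_omics, y, train_idx, test_idx, weight=0.5, ridge=1.0):
    """Predict test phenotypes from a weighted fusion of two similarity matrices.

    `weight` and `ridge` are illustrative tuning parameters, not FSBLUP's own.
    """
    k = weight * k_genomic + (1.0 - weight) * k_omics  # assumed fusion rule
    y_train = np.asarray(y, dtype=float)[train_idx]
    mu = y_train.mean()
    k_tt = k[np.ix_(train_idx, train_idx)]  # train-by-train similarities
    k_st = k[np.ix_(test_idx, train_idx)]   # test-by-train similarities
    # Solve (K_tt + ridge * I) alpha = y_train - mu, then project onto the test individuals.
    alpha = np.linalg.solve(k_tt + ridge * np.eye(len(train_idx)), y_train - mu)
    return mu + k_st @ alpha

rng = np.random.default_rng(1)
k_g = linear_kernel(rng.normal(size=(12, 30)))  # toy genotype-like features
k_o = linear_kernel(rng.normal(size=(12, 8)))   # toy intermediate omics features
phenotype = rng.normal(size=12)
train, test = np.arange(8), np.arange(8, 12)
print(fused_blup_predict(k_g, k_o, phenotype, train, test, weight=0.3))
```

In practice the mixing weight would be chosen by cross-validation or a similar tuning procedure; setting `weight=1.0` recovers a prediction based on the genomic matrix alone.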