Pub Date : 2026-01-15 DOI: 10.1186/s13059-025-03921-y
Wonji Kim, Xiaowei Hu, Kangjin Kim, Sung Chun, Peter Orchard, Dandi Qiao, Ingo Ruczinski, Aabida Saferali, Francois Aguet, Lucinda Antonacci-Fulton, Pallavi P Balte, Traci M Bartz, Wardatul Jannat Anamika, Xiaobo Zhou, JunYi Duan, Jennifer A Brody, Brian E Cade, Martha L Daviglus, Harshavadran Doddapaneni, Shannon Dugan-Perez, Susan K Dutcher, Christian D Frazar, Stacey B Gabriel, Sina A Gharib, Namrata Gupta, Brian D Hobbs, Silva Kasela, Laura R Loehr, Ginger A Metcalf, Donna M Muzny, Elizabeth C Oelsner, Laura J Rasmussen-Torvik, Colleen M Sitlani, Joshua Smith, Tamar Sofer, Hanfei Xu, Bing Yu, David Zhang, John Ziniti, R Graham Barr, April P Carson, Myriam Fornage, Lifang Hou, Ravi Kalhan, Robert Kaplan, Tuuli Lappalainen, Stephanie J London, Alanna C Morrison, George T O'Connor, Bruce M Psaty, Laura M Raffield, Susan Redline, Stephen S Rich, Jerome I Rotter, Edwin K Silverman, Ani Manichaikul, Michael H Cho
Background: Whole genome sequence (WGS) data in multi-ancestry samples supports discovery of low-frequency or population-specific genetic variants associated with chronic obstructive pulmonary disease (COPD) and lung function.
Results: We performed single-variant, structural-variant, and gene-based analyses of pulmonary function (FEV1, FVC and FEV1/FVC) and COPD case-control status in 44,287 multi-ancestry participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. We validated findings using the UK Biobank and assessed implicated genes using lung single-cell RNA-seq (scRNA-seq) data sets. Applying a genome-wide significance threshold (P < 5 × 10⁻⁹), we replicated known loci and identified novel associations near LY86, MAGI1, GRK7, and LINC02668. Colocalization with gene expression quantitative trait loci (eQTL) from the Lung Tissue Research Consortium highlighted known candidate genes including ADAM19, THSD4, C4B, and PSMA4, which were not identified through other eQTL sources. Multi-ancestry analysis improved fine-mapping resolution (e.g., HTR4 and RIN3). Gene-based analysis identified and replicated HMCN1. In human lung scRNA-seq data sets, lung epithelial cells and immune cell types showed enriched expression, while fibroblasts showed higher expression of HMCN1. CRISPR targeting HMCN1 in IMR90 demonstrated reduced expression of collagen genes.
Conclusions: Large-scale multi-ancestry WGS analysis improves variant discovery and fine-mapping resolution for lung function and COPD and highlights biologically relevant genes and pathways.
Title: Whole genome sequence analysis of pulmonary function and COPD in 44,287 multi-ancestry participants. Journal: Genome Biology (impact factor 12.3). Publication type: Journal Article. URL: https://doi.org/10.1186/s13059-025-03921-y
Pub Date : 2026-01-15 DOI: 10.1186/s13059-025-03874-2
Bingxian Xu, Rosemary Braun
Title: VIST: variational inference for single cell time series. Journal: Genome Biology. Publication type: Journal Article.
Pub Date : 2026-01-14 DOI: 10.1186/s13059-025-03910-1
Denis Kleverov, Ekaterina Aladyeva, Alexey Serdyukov, Maxim N Artyomov
Background: Non-negative matrix factorization is a powerful linear algebra tool used in multiple areas of data analysis, including computational biology. Despite numerous optimization methods devised for non-negative matrix factorization, our understanding of the inherent topological structure within factorizable matrices remains limited.
Results: This study reveals the topological properties of linear mixture data, leading to a remarkable reduction of the non-negative matrix factorization optimization problem to a search for K(K-1) variables, where K represents the number of pure components, regardless of the initial matrix size. This is achieved by revealing complementary simplex structures existing in both feature and sample spaces and leveraging the Sinkhorn transformation to find the relationship between these simplexes. We validate this approach in the context of an unconstrained mixed images scenario and achieve a significant improvement in decomposition accuracy. Furthermore, we successfully applied the proposed approach in the biological context of bulk RNA-seq gene expression deconvolution.
Conclusions: The Dual Simplex unified analytical framework improves robustness to noise and enhances optimization stability, enabling accurate recovery of component proportions and expression profiles. Importantly, the framework naturally accommodates both reference-free and marker-based deconvolution settings, providing a general and efficient solution for analyzing complex biological mixtures such as bulk RNA-seq and single-cell derived data.
Title: Non-negative matrix factorization and deconvolution as a dual simplex problem. Journal: Genome Biology. Publication type: Journal Article. URL: https://doi.org/10.1186/s13059-025-03910-1
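The abstract above names the Sinkhorn transformation without describing it. As a rough illustration only, the sketch below shows the classical Sinkhorn-Knopp form of that idea: alternately rescaling the rows and columns of a strictly positive matrix until both sets of sums equal 1. It is not the authors' Dual Simplex implementation. The function name `sinkhorn_balance`, the restriction to square matrices, the tolerance, and the toy input are all assumptions made for this example.

```python
def sinkhorn_balance(matrix, max_iter=1000, tol=1e-10):
    """Alternately normalize rows and columns of a strictly positive
    square matrix until every row and column sums to (almost) 1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("this sketch expects a square matrix")
    if any(x <= 0 for row in a for x in row):
        raise ValueError("all entries must be strictly positive")

    for _ in range(max_iter):
        # Row step: divide each row by its own sum.
        row_sums = [sum(row) for row in a]
        a = [[x / s for x in row] for row, s in zip(a, row_sums)]

        # Column step: divide each column by its own sum.
        col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
        a = [[a[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

        # Columns now sum to 1; stop once the rows do as well.
        if max(abs(sum(row) - 1.0) for row in a) < tol:
            break
    return a


# Toy "mixture" matrix (made-up numbers).
mixed = [
    [4.0, 1.0, 1.0],
    [2.0, 3.0, 1.0],
    [1.0, 2.0, 5.0],
]
balanced = sinkhorn_balance(mixed)
```

After balancing, each row and each column of `balanced` sums to approximately 1. How the paper applies such a scaling to rectangular expression matrices and relates the two simplexes is not stated in the abstract and is not modeled here.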
Background: The NONO protein plays a crucial role in RNA metabolism and DNA repair. It undergoes various post-translational modifications, including phosphorylation, ubiquitination, acetylation and methylation, all of which regulate its diverse cellular functions. However, the role of O-GlcNAcylation in regulating NONO's function in DNA damage repair is not well understood.
Results: This study demonstrates that O-GlcNAcylation of NONO at Serine 147 (Ser147) is essential for its recruitment to DNA damage sites. Specifically, O-GlcNAcylation at Ser147 reduces NONO ubiquitination and stabilizes its interaction with SFPQ, regulating the alternative splicing of the histone methyltransferase SETMAR. A deficiency in O-GlcNAcylation at Ser147 impairs NONO's binding to SETMAR pre-mRNA, leading to increased production of the truncated isoform of SETMAR (SETMAR-S). The resulting SETMAR-S suppresses the generation of H3K36me2 and inhibits the recruitment of Ku70 to DNA damage sites, ultimately impairing non-homologous end joining (NHEJ) repair. Furthermore, the disruption of O-GlcNAcylation at Ser147 sensitizes liver cancer cells to ionizing radiation treatment, both in vitro and in vivo.
Conclusions: O-GlcNAcylation at Ser147 of NONO mediates the alternative splicing of SETMAR and facilitates NHEJ repair. Collectively, our findings suggest that targeting NONO O-GlcNAcylation may provide a novel therapeutic strategy for cancer treatment.
Title: O-GlcNAcylation of NONO mediates alternative splicing of SETMAR and facilitates NHEJ repair. Journal: Genome Biology. Publication type: Journal Article. URL: https://doi.org/10.1186/s13059-026-03930-5
Pub Date : 2026-01-14 DOI: 10.1186/s13059-026-03930-5
Mengyuan Li, Huanna Tian, Ziyi Zhou, Yuhan Jiang, Xiaomeng Guo, Weijie Qin, Hongbing Zhang, Yajie Jiao, Shuai Guo, Chen Wu
Pub Date : 2026-01-14 DOI: 10.1186/s13059-026-03931-4
Yahui Xue, Lei Zhou, Yue Zhuo, Weining Li, Sijia Ma, Heng Du, Wanying Li, Jicai Jiang, Jian-Feng Liu
The increasing availability of multi-omics data holds promise for enhancing genomic prediction in breeding and human genetics. However, integrating multi-omics data into genomic prediction models remains challenging because of the complex relationships between omics layers and phenotypic outcomes. We propose Fusion Similarity Best Linear Unbiased Prediction (FSBLUP), a novel strategy that integrates genomic and intermediate omics data using a unified similarity matrix approach. FSBLUP systematically estimates how different omics layers contribute to phenotypic variation via machine-learning-optimized parameters that capture the underlying genetic architecture of complex traits. FSBLUP demonstrates greater predictive accuracy than existing methods, as validated through theoretical and practical evaluations.
Title: FSBLUP: a novel strategy of fusion similarity matrix construction via optimally integrating intermediate omics data to enhance genomic prediction. Journal: Genome Biology. Publication type: Journal Article. URL: https://doi.org/10.1186/s13059-026-03931-4
Pub Date : 2026-01-09 DOI: 10.1186/s13059-025-03929-4
Tongxuan Lv, Quan Han, Yilin Li, Chen Liang, Zhonghao Ruan, Haoyu Chao, Ming Chen, Dijun Chen
Background: The regulation of gene expression in plants is governed by complex interactions between cis-regulatory elements and epigenetic modifications such as histone marks. While deep learning models have achieved success in predicting regulatory features from DNA sequence, their cross-species generalizability in plants remains largely unexplored.
Results: We systematically evaluate the ability of deep learning models to predict histone modifications across plant species using a multi-stage framework based on the Sei architecture. We train species-specific models for Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and maize (Zea mays), achieving high within-species predictive performance and strong agreement between predictions and experimental ChIP-seq profiles. However, cross-species predictions show reduced performance with increasing phylogenetic distance, highlighting limited model transferability between monocots and dicots. To improve generalization, we construct a Poaceae family-level model by jointly training on rice and maize, and a model trained solely on Arabidopsis. These models demonstrate robust predictive power in completely unprofiled species that are not used in the training set, highlighting the models' adaptability to novel plant genomes based solely on conserved regulatory syntax. In contrast, cross-family models produce less consistent results, with reliable performance only in species sharing conserved regulatory features. We also develop an easy-to-use pipeline that predicts genome-wide chromatin signals directly from DNA sequences.
Conclusions: Our findings demonstrate that phylogenetically informed model training significantly improves cross-species epigenomic prediction, offering a scalable computational strategy for functional annotation in non-model and agriculturally important plants.
Title: Cross-species prediction of histone modifications in plants via deep learning. Journal: Genome Biology. Publication type: Journal Article. URL: https://doi.org/10.1186/s13059-025-03929-4
Background: Root rot disease caused by fungal pathogens of wine grapevines poses a serious threat to their growth and has a substantial economic impact on the grape industry. The rhizosphere microbiome recruited to plants is critical for mitigating soil-borne pathogens. However, how beneficial microbes influence disease resistance remains unclear.
Results: We investigate the composition and gene functions of microorganisms in wine grapevines with root rot disease and healthy controls by amplicon and metagenomic sequencing. We use culturomics and in vivo experiments to verify the pathogen and beneficial strains to improve plant health. We find that root rot disease in grapevines significantly affects rhizosphere microbiome diversity and composition. The microbial interkingdom network indicates that the disease destabilizes the bacteria-fungi co-occurrence network. We find that plants recruit the potentially beneficial bacteria Pseudomonas, Bacillus and Streptomyces in healthy rhizosphere soil. By culturomics, we confirm that Fusarium solani is the main pathogen causing root rot disease. We further observe that these three key beneficial bacteria from the co-occurrence networks enhance the resistance of grapevines to pathogens. Furthermore, metagenomic analysis reveals that beneficial bacterial strains suppress pathogens by enriching potential functional genes in pathways involved in disease resistance.
Conclusions: Our findings highlight the critical role of disease resistance pathways of potentially beneficial microorganisms in fighting disease and supporting plant health, offering new insight for the exploration of beneficial microbial resources and providing a basis for the development of biological control of grape root rot disease.
Title: Core microbiota recruited by healthy grapevines enhance resistance against root rot disease. Journal: Genome Biology (pages: 13). Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12857107/pdf/
Pub Date : 2026-01-05 DOI: 10.1186/s13059-025-03905-y
Ruotong Wang, Wenyu Zhang, Zhishan He, Yao Zhou, Cheng Chen, Kaibo Song, Qingwu Shang, Yunfeng Wu, Peiwen Gu, Duntao Shu, Lei Zhao
Pub Date : 2026-01-05 DOI: 10.1186/s13059-025-03913-y
Tessa R MacNish, Thomas Bergmann, David Edwards
There is an urgent need to increase sustainable crop production. The application of molecular marker technologies, such as genomic selection and machine-learning-based approaches, is helping to accelerate crop improvement. Conventional molecular marker technologies use single nucleotide polymorphisms to predict traits; however, these do not capture local epistasis and can be challenging for machine learning applications. With the growth of genome sequence data, it is possible to define haplotypes that can account for local epistatic effects and are more suitable for machine learning models. This review discusses the different methods for defining haplotype blocks and their application in plant breeding.
Title: Haplotype applications in genomic selection. Journal: Genome Biology (pages: 18). Publication type: Journal Article.
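To make the contrast between single-SNP markers and haplotype markers in the abstract above concrete, here is a minimal sketch of one simple way to define haplotype blocks: fixed-length windows over phased SNP alleles. This is an assumption chosen for illustration, not necessarily a method the review covers or recommends. The function name `haplotype_features`, the window size, and the toy genotypes are hypothetical.

```python
from collections import Counter


def haplotype_features(phased, window=3):
    """Turn phased SNP alleles into haplotype-block features.

    phased: dict mapping an individual to a pair of equal-length
            strings of '0'/'1' alleles over the same ordered SNPs.
    Returns a dict mapping each individual to a Counter keyed by
    (block_index, allele_pattern) with the number of copies carried.

    Hypothetical helper; fixed-size windows are an assumption.
    """
    features = {}
    for name, (hap_a, hap_b) in phased.items():
        if len(hap_a) != len(hap_b):
            raise ValueError(f"haplotypes of {name} differ in length")
        counts = Counter()
        for start in range(0, len(hap_a), window):
            block = start // window
            # Each chromosome copy contributes one haplotype allele.
            counts[(block, hap_a[start:start + window])] += 1
            counts[(block, hap_b[start:start + window])] += 1
        features[name] = counts
    return features


# Made-up phased genotypes for two individuals at six SNPs.
phased = {
    "line_1": ("010110", "010011"),
    "line_2": ("110110", "110110"),
}
features = haplotype_features(phased, window=3)
```

Each feature records how many copies of a multi-SNP pattern an individual carries, so a model can assign an effect to a specific combination of nearby alleles rather than to each SNP separately. That is the sense in which haplotypes can reflect local epistasis.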
Pub Date : 2026-01-05 DOI: 10.1186/s13059-025-03924-9
Yicheng Huang, Enlai Guan, Shipeng Song, Dal-Hoe Koo, Monica A Schmidt, Handong Su, Chunli Chen, Jianwei Zhang
Background: Centromere function is fundamental and conserved across eukaryotes, despite highly divergent DNA sequences, even among closely related species. These regions often contain rapidly evolving repeats and retrotransposons, yet play a crucial role in chromosome segregation. Soybean, which harbors two distinct types of centromeric satellite repeats, is an ideal model for studying centromeric repeat organization and function.
Results: Here we generate the complete map of centromeric satellite repeats, revealing the organizational patterns of different types of centromeric satellite repeats within centromeres. These maps are constructed using three recently available telomere-to-telomere soybean genomes. We find that certain centromeric satellite repeats exhibit chromosome-specific evolutionary trajectories and may serve distinct functional roles in centromere activity. We further analyze the potential relationship between centromere-specific histone H3 (CENH3) and centromeric satellite repeats, identifying consensus motifs associated with CENH3-binding sites. We also analyze the higher-order tandem repeats of the centromere and propose a hypothetical model of centromeric DNA replication.
Conclusions: We conclude that CentGm-1 and CentGm-4 evolve independently. The observation that completely identical CentGm-4 sequences consistently appear on the same chromosome across different soybean varieties indicates a stronger chromosome-specific preference for CentGm-4. We propose a model in which replication templates within the centromere region originate from multiple CENH3-nucleosome complexes bound to CentGm sequences. Both CentGm-1 and CentGm-4 contain similar motifs with the potential to bind CENH3 protein. The findings provide new insight into the mechanisms behind centromere diversity and dynamics.
Title: Genetic diversity and architectural dynamics of soybean centromeres. Journal: Genome Biology (pages: 17). Publication type: Journal Article.
Understanding human blood metabolites is essential for deciphering systemic physiology and disease mechanisms, yet it remains challenging because of their diverse origins and dynamic regulation. In this study, we develop HUBMet (https://hubmet.app.bio-it.tech/home), an open-access web server that includes 3,950 metabolites and 129,814 metabolite-protein associations and provides four analytical modules: Over-Representation Analysis (ORA) for enrichment analysis; Metabolite Set Enrichment Analysis (MSEA) for quantitative data analysis; Tissue Specificity Analysis (TSA) for assessing metabolite-tissue relevance; and Metabolite-Protein Network Analysis (MPNet) for identifying key metabolite-protein associations and functional modules. We demonstrate HUBMet's utility through a COVID-19 case study that reveals metabolic signatures associated with disease severity.
{"title":"HUBMet: an integrative database and analytical platform for human blood metabolites and metabolite-protein associations.","authors":"Xingyue Wang, Xiangyu Qiao, Alberto Zenere, Swapnali Barde, Jing Wang, Wen Zhong","doi":"10.1186/s13059-025-03922-x","PeriodicalId":48922,"journal":{"name":"Genome Biology","pages":"7"},"PeriodicalIF":12.3,"publicationDate":"2025-12-27","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145846997","RegionNum":1,"RegionCategory":"Biology"}