
Genomics, Proteomics & Bioinformatics: Latest Articles

Benchmark and Evaluation for Somatic Structural Variants Detection with Long-read Sequencing Data.
IF 7.9 Pub Date : 2025-12-31 DOI: 10.1093/gpbjnl/qzaf139
Ziting Feng, Xuyan Liu, Yahui Liu, Kailing Tu, Lin Xia, Dan Xie

Somatic structural variations (somatic SVs) are hallmarks of tumors, but their comprehensive detection remains technically challenging. Long-read sequencing (LRS) technology, which generates reads spanning large-scale SVs and their flanking sequences, opens broad prospects for somatic SV detection. However, existing LRS-based somatic SV detection algorithms and pipelines exhibit variable performance that has not been systematically characterized. In this study, we conducted a rigorous evaluation of 51 LRS-based somatic SV detection strategies, integrating 3 reference genomes, 2 aligners, 5 SV callers, and 5 processing methods tailored for SV callers. We used both simulated datasets and empirical data from HCC1395/HCC1395BL cell lines sequenced on Oxford Nanopore (ONT) and Pacific Biosciences (PacBio) platforms for technical assessment. Our findings highlight the need for further refinement of specialized somatic SV detection tools, as no single strategy consistently outperforms the others across all scenarios. Workflows based on germline SV callers exhibit a high false-positive rate, which cannot be mitigated by increasing sequencing depth or tumor purity. Furthermore, challenges persist in detecting insertions, SVs in genomic tandem repeat regions, and ultra-long SVs. We delineate technical bottlenecks in current somatic SV detection approaches and provide recommendations for their further advancement. Additionally, we offer suggestions for selecting specific tools in different application scenarios. This work offers a comprehensive benchmark for somatic SV detection and valuable insights for future LRS-based tool development and methodological improvements.
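
Benchmarks of this kind usually score each strategy by comparing its calls against a truth set. The following is a minimal Python sketch of that idea, not the authors' evaluation code: the `SV` record, the 500 bp breakpoint tolerance, and the greedy one-to-one matching rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a full VCF entry)."""
    chrom: str
    pos: int
    svtype: str  # e.g. "DEL", "INS", "DUP", "INV", "BND"


def matches(call, truth, tol=500):
    """Assumed rule: same chromosome, same type, breakpoints within tol bp."""
    return (call.chrom == truth.chrom
            and call.svtype == truth.svtype
            and abs(call.pos - truth.pos) <= tol)


def score(calls, truth_set, tol=500):
    """Return (precision, recall, F1) using greedy one-to-one matching."""
    unmatched = list(truth_set)
    tp = 0
    for call in calls:
        for truth in unmatched:
            if matches(call, truth, tol):
                unmatched.remove(truth)  # each truth SV is matched at most once
                tp += 1
                break
    fp = len(calls) - tp       # calls with no truth partner
    fn = len(unmatched)        # truth SVs that no call recovered
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truth_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Under this scheme, a germline-caller workflow with many false positives would show low precision even when its recall is high, which is the pattern the abstract describes.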

Citations: 0
REC8-Cohesin Preferentially Localizes to Promoters of Genes that are Regulated by Transcription Suppressor BEND2 During Early Meiosis.
IF 7.9 Pub Date : 2025-12-31 DOI: 10.1093/gpbjnl/qzaf138
Dan Xie, Longfei Ma, Jing Sun, Hengyu Nie, Lin Yan, Yalin Xue, Jian Chen, Shuguang Duo, Chunsheng Han

Cohesin plays critical roles in chromatin organization and transcription regulation. REC8 is a meiosis-specific cohesin subunit and is essential for homologous chromosome synapsis, recombination, and segregation. However, little is known about the relationship between the dynamic genome-wide distribution of cohesin and transcription regulation during meiotic initiation. In this study, we report that REC8-cohesin is preferentially localized to open promoter regions of genes involved in spermatogonial differentiation and meiosis during early meiosis, from preleptonema to zygonema. Genomic localization of REC8-cohesin is altered by knockout of the transcriptional suppressor BEND2. We also find that REC8 is able to interact with the mitotic cyclin CCNA2, that CCNA2 expression is extended to leptonema in Bend2 knockout mice, and that the meiotic cells of Bend2 knockout mice do not exit the mitotic cell cycle completely. We further find that a large number of genes are commonly bound by BEND2, STRA8, MEIOSIN, and REC8-cohesin. Our study therefore reveals that genes with open promoters are coordinately bound by meiotic cohesin and transcription factors to facilitate chromatin reorganization and transcription regulation, leading to the switch from a mitotic cell cycle to a meiotic one at the initiation stage of meiosis.

Citations: 0
iRUNNER: A Baseline Mutation Burden Regression for Identifying Gene Interaction Between Rare Variants for Diseases.
IF 7.9 Pub Date : 2025-12-30 DOI: 10.1093/gpbjnl/qzaf135
Hui Jiang, Bin Tang, Kun Li, Liubin Zhang, Junhao Liang, Clara Sze-Man Tang, Paul Kwong-Hang Tam, Binbin Wang, Youqiang Song, Qiang Wang, Mulin Jun Li, Hailiang Huang, Miaoxin Li

Genetic interactions play a crucial role in elucidating the susceptibility and etiology of complex multifactorial diseases. Despite significant efforts to identify disease-associated nonlinear effects in genome-wide association studies, efficient methods for detecting the epistatic impact of rare variants remain lacking. In this study, we propose iRUNNER, a novel and powerful mutation burden test focused on analyzing the interaction effects of rare variants on a binary trait. Unlike conventional association tests that compare cases with controls, iRUNNER evaluates the relative enrichment of the rare variant interaction burden of gene pairs in patients against its baseline, which is estimated by a recursive truncated negative-binomial regression model that leverages multiple genomic features from public databases. Extensive simulations demonstrated that iRUNNER outperforms existing epistasis tests in statistical power and maintains reasonable type I error rates even when population stratification exists in control samples. Applied to real datasets of five complex diseases, iRUNNER yielded substantial gains in the detection of gene-gene interactions. Notably, the majority of these signals were missed by alternative methods, especially in small to medium-sized samples. Furthermore, we found that the gene pairs identified for each trait can form interconnected networks, which may provide valuable insights into the underlying molecular mechanisms. We have implemented iRUNNER as a module in our integrative platform KGGSeq (http://pmglab.top/kggseq/), which enables rapid testing of pairwise interactions among all possible non-synonymous rare coding variants within hours.
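
The idea of testing an observed burden against a model-predicted baseline, rather than against controls, can be sketched as a one-sided negative-binomial tail test. The Python below is only an illustration under stated assumptions: the function names are hypothetical, the baseline mean `mu` and dispersion `size` are taken as given, and the recursive truncated regression that iRUNNER uses to estimate them is not reproduced here.

```python
import math


def nb_pmf(k, mu, size):
    """Negative-binomial probability of count k, given mean mu and dispersion size."""
    if mu <= 0 or size <= 0:
        raise ValueError("mu and size must be positive")
    p = size / (size + mu)  # success probability in the (size, p) parametrization
    log_pmf = (math.lgamma(k + size) - math.lgamma(k + 1) - math.lgamma(size)
               + size * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def enrichment_p_value(observed, mu, size):
    """Upper-tail probability P(X >= observed) under the assumed baseline."""
    lower = sum(nb_pmf(i, mu, size) for i in range(observed))
    return max(0.0, 1.0 - lower)
```

A small p-value here would indicate that a gene pair carries more co-occurring rare variants in patients than the baseline predicts; in practice such values would still need multiple-testing correction across all gene pairs.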

Citations: 0
Long-read Sequencing Reveals Repeat Expansions and Large Structural Variants in Oral Squamous Cell Carcinoma.
IF 7.9 Pub Date : 2025-12-27 DOI: 10.1093/gpbjnl/qzaf133
Li Hu, Jiaxun Zhang, Zhuoyuan Zhang, Ranlei Wei, Jie Fu, Xiaoxue Tang, Xuyan Liu, Lanfang Yuan, Ziting Feng, Sibo Wu, Lin Xia, Dan Xie

Previous genomic studies have predominantly analyzed oral squamous cell carcinoma (OSCC) in conjunction with other head and neck squamous cell carcinomas (HNSCC), constraining our comprehension of OSCC-specific structural variants (SVs). Here, we performed long-read whole-genome sequencing on 16 paired OSCC tumor and blood samples to elucidate the biological functions of somatic SVs. We identified a total of 5775 high-confidence somatic SVs, including five recurrent simple repeat expansions (SREs). Notably, one SRE located within the promoter region of the OBI1 gene is present in 45% of OSCC samples. Knocking out this SRE in the HSC4 cell line significantly reduces the expression of OBI1, resulting in decreased proliferative and migratory capacities compared to wild-type cells. Furthermore, we found that the frequently amplified region 11q13 in HNSCC is prone to large-scale somatic SVs, affecting the expression of ANO1, FADD, and CTTN, thereby confirming the association of SVs in this region with OSCC development. Our study provides novel insights into the role of somatic SVs in OSCC, especially with respect to SREs and large-scale SVs in critical genomic regions, thereby enhancing our comprehension of the molecular pathogenesis of OSCC.

Citations: 0
Multiomics Analysis Reveals How Intratumoral Bacteria Shape the Immune Microenvironment in Gastric Cancer.
IF 7.9 Pub Date : 2025-12-27 DOI: 10.1093/gpbjnl/qzaf132
Yang Mi, Die Dai, Xia Xue, Haiming Qin, Feifei Ren, Barry J Marshall, Alfred Tay, Ihtisham Bukhari, Xiaojie Li, Shaogong Zhu, Yong Yu, Wanqing Wu, Yan Tan, Youcai Tang, Xin Xie, Haiqing Bai, Xiaochen Yin, Pengyuan Zheng

The occurrence and progression of gastric cancer (GC) are closely associated with dysbiosis of the gastric microbiota and alteration in host microenvironments. However, the interaction between intratumoral bacteria and gastric microenvironments remains incompletely understood. In this study, we characterized the biological profiles of intratumoral bacteria, metabolome, and proteome in 20 GC tumors and paired non-tumor tissues, in combination with six independent datasets (comprising 497 gastric tissue biopsies and 554 normal tissues), as well as mucosal tissues from 10 individuals without GC. We found that the diversity and richness of gastric microbiota were significantly higher in tumor tissues than in non-tumor tissues. In contrast, the lowest biodiversity, at both the genus and species levels, was found in the microbiota of individuals without GC. Specifically, tumors were enriched with Bacteroides thetaiotaomicron, Lactobacillus parabrevis, Brevundimonas nasdae, and Brevundimonas vesicularis. We also identified 39 human immunity-related proteins, particularly in the tryptophan metabolic pathway, which were differentially expressed across various microenvironments (tumor and non-tumor). Furthermore, we found that several pathways involved in the human immune system and associated with the gastric microbiota, such as thiazole biosynthesis II, pyrimidine deoxyribonucleoside salvage, superpathway of pyrimidine deoxyribonucleoside salvage, and superpathway of heme biosynthesis from uroporphyrinogen-III, hold potential as biomarkers for early detection of GC. Our results provide a comprehensive framework for investigating the complex interactions between the tumor immune microenvironment and intratumoral bacterial community.

Citations: 0
Unveiling Tissue Structure and Tumor Microenvironment from Spatial Omics by Hypergraph Learning.
IF 7.9 Pub Date : 2025-12-26 DOI: 10.1093/gpbjnl/qzaf128
Yi Liao, Chong Zhang, Zhikang Wang, Fei Qi, Weitian Huang, Shangyan Cai, Junyu Li, Jiazhou Chen, Robin B Gasser, Zhiyuan Yuan, Jiangning Song, Hongmin Cai

Spatial omics technologies have revolutionized life sciences by enabling the simultaneous acquisition of biomolecular and spatial information. Identifying spatial patterns is crucial for understanding organ development and tumor microenvironments. However, the emergence of diverse spatial omics resolutions in these technologies has made it challenging to accurately characterize spatial domains at finer resolutions. To address this, we propose HyperSTAR, a hypergraph-based method designed to precisely identify spatial domains across varying resolutions by leveraging higher-order relationships among spatially adjacent tissue programs. Specifically, a gene expression-guided hyperedge decomposition module is introduced to refine the hypergraph structure so that spatial domain boundaries are accurately delineated. Additionally, a hypergraph attention convolutional neural network is designed to adaptively learn the importance of each hyperedge, enhancing the model's ability to capture complex higher-order relationships within spatially neighboring multi-spots and/or single cells. HyperSTAR outperforms existing graph neural network models in tasks such as uncovering tissue substructures, inferring spatiotemporal patterns, and denoising spatially resolved gene expression. It effectively handles diverse spatial omics data types and scales seamlessly to large datasets. The method successfully reveals spatial heterogeneity in breast cancer sections, with findings validated through functional and survival analyses of independent clinical data. HyperSTAR represents a significant advancement in spatial omics analysis, providing a robust tool for exploring complex spatial patterns across varying resolutions and data types. Its ability to capture intricate higher-order relationships among spatially neighboring spots/cells makes it valuable for advancing research in life sciences, particularly in cancer and developmental biology. The toolbox is available at https://github.com/Ringoio/HyperSTAR.
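
To make the notion of higher-order relationships among neighboring spots concrete, the sketch below builds a simple spatial hypergraph in Python. It is an assumption-laden illustration, not HyperSTAR's construction: each spot contributes one hyperedge containing itself and its `k` nearest neighbors, and the function names are hypothetical. The expression-guided hyperedge decomposition and the attention network described in the abstract are not shown.

```python
import math


def knn_hyperedges(coords, k):
    """One hyperedge per spot: the spot itself plus its k nearest spatial neighbors."""
    edges = []
    for i, ci in enumerate(coords):
        # Rank all other spots by Euclidean distance (ties broken by index).
        others = sorted((j for j in range(len(coords)) if j != i),
                        key=lambda j: math.dist(ci, coords[j]))
        edges.append(frozenset([i, *others[:k]]))
    return edges


def incidence_matrix(n_nodes, edges):
    """H[v][e] is 1 when spot v belongs to hyperedge e, else 0."""
    return [[1 if v in e else 0 for e in edges] for v in range(n_nodes)]
```

Unlike an ordinary graph edge, which links exactly two spots, each column of the incidence matrix groups several spots at once; this is the structure over which a hypergraph convolution would propagate information.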

Citations: 0
MPCutter: Predicting Protease-specific Substrate Cleavage Sites Using a Protein Language Model.
IF 7.9 Pub Date : 2025-12-23 DOI: 10.1093/gpbjnl/qzaf130
Zhe Wang, Tuoyu Liu, Guoshun Xu, Han Gao, Ruohan Zhang, Honglian Zhang, Guijie Zhang, Ningfeng Wu, Bin Yao, Huiying Luo, Feifei Guan, Jian Tian

Proteases can cleave peptide bonds of target substrate proteins. Their controlled proteolysis is vital for protein degradation, recycling, and physiological processes. Understanding the hydrolytic mechanisms of proteases is crucial, particularly for identifying their specific substrates and cleavage sites. Bioinformatics approaches can predict novel protease-substrate cleavage events with high accuracy using sequence and structural information. However, existing tools for cleavage site prediction face several limitations, including restricted accuracy due to limited data and cumbersome training processes that impede timely updates. To address these challenges, we developed MPCutter, which was created by fine-tuning a general-purpose protein sequence language model. This method combined the extensive knowledge of the general model with the targeted optimization of fine-tuning, providing a powerful tool for protease-substrate cleavage prediction. MPCutter offers optimized cleavage site prediction models with enhanced performance and broader coverage across proteases, encompassing four major protease families including 62 distinct proteases. Benchmarking experiments using independent test datasets demonstrated that MPCutter outperformed existing generic tools. In our case study and experiments, MPCutter precisely recognized the majority of cleavage sites and validated five caspase-3 cleavage sites crucial for cellular physiology. Notably, its application to the 10,260-protein human proteome and specific cancer pathways revealed potential new target substrates and provided insights into key biochemical behaviors of proteases. MPCutter is expected to serve as a powerful tool for high-throughput prediction of protease-specific substrates and to facilitate hypothesis-driven exploration of protease proteolytic events. The MPCutter code and associated data are freely available at https://github.com/2053798680wang/MPCutter.git.
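
Sequence-based cleavage site predictors commonly score a short window of residues around each candidate scissile bond. The Python sketch below enumerates such windows; it is a generic illustration and not MPCutter's preprocessing. The function name and the choice of four flanking residues on each side (positions P4 to P4') are assumptions.

```python
def cleavage_windows(sequence, flank=4):
    """List (bond, window) pairs for every bond with `flank` residues on each side.

    `bond` is the number of residues N-terminal to the candidate scissile bond,
    so the window covers positions P{flank}..P1 followed by P1'..P{flank}'.
    """
    windows = []
    for bond in range(flank, len(sequence) - flank + 1):
        windows.append((bond, sequence[bond - flank: bond + flank]))
    return windows
```

Each window would then be passed to a protease-specific model that outputs a cleavage score. Bonds closer than `flank` residues to either terminus are skipped in this sketch; a real pipeline might pad them instead.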

Citations: 0
The Indicine X-linked CYBB L237M Mutation Can Suppress Intracellular Infection with Tubercle Bacilli.
IF 7.9 Pub Date : 2025-12-23 DOI: 10.1093/gpbjnl/qzaf131
Haoxin Wang, Xiaoting Xia, Lulan Zeng, Jing Yang, Jing Han, David E MacHugh, Johannes A Lenstra, Yanliang Song, Ajiao Fan, Yifan Zhu, Zhenliang Zhu, Xinyan Zhang, Yingyu Chen, Jianlin Han, Chuzhao Lei, Ningbo Chen, Yong Zhang, Yuanpeng Gao

Indicine cattle exhibit superior resistance to Mycobacterium bovis infection compared to taurine breeds, revealing divergent genetic mechanisms underlying bovine tuberculosis (bTB) resilience. Previous research has demonstrated that Cytochrome b-245 (CYBB) gene variants are associated with Mendelian susceptibility to Mycobacterium tuberculosis complex (MTBC) infections. In this study, we analyzed the X-chromosomal sequences from 258 female cattle and identified a divergent missense variant (L237M) in the CYBB gene. This variant occurs at high frequencies in indicine populations. Functional studies using murine macrophages revealed that CYBB L237M mitigates M. tuberculosis-induced ferroptosis by elevating glutathione synthesis and glutathione peroxidase 4 expression. Mechanistically, the L237M substitution enhances the stability of the nicotinamide adenine dinucleotide phosphate (NADPH) oxidase 2 (NOX2) and p22phox complex (NOX2-p22), which is critical for the generation of phagosomal reactive oxygen species and bacterial clearance. Our findings demonstrate that CYBB L237M promotes intracellular MTBC elimination through ferroptosis suppression, partially explaining the superior bTB resistance of indicine cattle. This study highlights X-chromosomal genetic variation as an evolutionary driver of innate immunity against mycobacterial infections, with implications for breeding strategies and host-directed tuberculosis therapies. The CYBB variant exemplifies how cattle subspecies divergence can illuminate conserved antimicrobial defense mechanisms in mammals.

Citations: 0
Computational Analyses and Challenges of Single-cell ATAC-seq.
IF 7.9 Pub Date : 2025-12-22 DOI: 10.1093/gpbjnl/qzaf115
Chenfei Wang, Jiaojiao Zhou, Hong Zhang, Zihan Zhuang, Gali Bai, Ming Tang, Song Liu, Tao Liu

Single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) has emerged as a powerful technique to study cell-specific epigenetic landscapes and to provide a multidimensional portrait of gene regulation. However, low genomic coverage per cell results in intrinsic data sparsity and missing-data issues, presenting unique methodological challenges. Consequently, numerous computational methods and techniques have been developed to address these challenges. This review provides a concise overview of published workflows for scATAC-seq analysis, covering preprocessing through downstream analysis including quality control, alignment, peak calling, dimensionality reduction, clustering, gene regulation score calculation, cell type annotation, and multiomics integration. Additionally, we survey key scATAC-seq databases that offer curated, accessible resources; discuss emerging deep-learning methods and Artificial Intelligence (AI) foundation models tailored to scATAC-seq data; and highlight recent advances in spatial ATAC-seq technologies and associated computational approaches. Our objective is to equip readers with a clear understanding of current scATAC-seq methodologies so they can select appropriate tools and construct customized workflows for exploring gene regulation and cellular diversity.
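
As a concrete illustration of the quality-control step named in the abstract above, the following is a minimal sketch of one commonly used per-cell metric, the fraction of fragments in peaks (FRiP). The abstract does not say which metrics the review covers, so the choice of FRiP, the function name `frip_per_cell`, the tuple layouts, and the 0.6 cutoff are all assumptions made for illustration, not the authors' method.

```python
from bisect import bisect_right
from collections import defaultdict


def frip_per_cell(fragments, peaks):
    """Return {barcode: fraction of that cell's fragments overlapping a peak}.

    fragments: iterable of (chrom, start, end, barcode), half-open coordinates.
    peaks:     iterable of (chrom, start, end), assumed non-overlapping.
    Both layouts are hypothetical and chosen only for this sketch.
    """
    # Group peaks by chromosome and sort them by start coordinate.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()
    starts = {c: [s for s, _ in iv] for c, iv in by_chrom.items()}

    total = defaultdict(int)
    in_peak = defaultdict(int)
    for chrom, start, end, barcode in fragments:
        total[barcode] += 1
        intervals = by_chrom.get(chrom)
        if not intervals:
            continue
        # Index of the last peak that starts before the fragment ends.
        i = bisect_right(starts[chrom], end - 1) - 1
        # Because peaks do not overlap, only this peak can overlap the fragment.
        if i >= 0 and intervals[i][1] > start:
            in_peak[barcode] += 1
    return {bc: in_peak[bc] / total[bc] for bc in total}


# Toy input: two peaks on chr1 and five fragments from two cell barcodes.
peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
fragments = [
    ("chr1", 150, 180, "AAAC"),
    ("chr1", 300, 350, "AAAC"),
    ("chr1", 590, 700, "AAAC"),
    ("chr1", 190, 210, "TTTG"),
    ("chr2", 10, 50, "TTTG"),
]
frip = frip_per_cell(fragments, peaks)

# Hypothetical cutoff; real thresholds depend on the dataset and protocol.
kept = sorted(bc for bc, value in frip.items() if value >= 0.6)
print(frip, kept)
```

In this toy input, barcode AAAC has two of three fragments in peaks and TTTG has one of two, so only AAAC passes the illustrative cutoff. Sorting the peaks once and using binary search keeps each lookup logarithmic in the number of peaks per chromosome, which matters because scATAC-seq fragment files are large.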

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12753137/pdf/
Citations: 0
Unveiling Neonatal Pneumonia Microbiome by High-throughput Sequencing and Droplet Culturomics.
IF 7.9 Pub Date : 2025-12-22 DOI: 10.1093/gpbjnl/qzaf047
Zerui Wang 王则锐, Xin Cheng 程欣, Yibin Xu 徐义斌, Zhiyi Wang 汪之一, Liyan Ma 马立艳, Caiming Li 李埰明, Shize Jiang 姜世泽, Yuchen Li 黎雨尘, Shuilong Guo 郭水龙, Wenbin Du 杜文斌

Neonatal pneumonia is a leading cause of infant mortality worldwide; however, a lack of microbial profiling, especially of low-abundance species, makes accurate diagnosis challenging. Traditional methods can fail to capture the complexity of the neonatal respiratory microbiota, thereby obscuring its role in disease progression. Here, we describe a novel approach that combines high-throughput sequencing with droplet-based microfluidic cultivation to investigate microbiome shifts in neonates with pneumonia. Using 16S ribosomal RNA (rRNA) gene sequencing of 71 pneumonia cases and 49 controls, we identified 1009 genera, including 930 low-abundance taxa, which showed significant compositional differences between groups. Linear discriminant analysis effect size identified key pneumonia-associated genera, such as Streptococcus, Rothia, and Corynebacterium. Droplet-based cultivation recovered 299 strains from 94 taxa, including rare species and ESKAPE pathogens, thereby supporting targeted antimicrobial management. Host-pathogen interaction assays showed that Rothia and Corynebacterium induced inflammation in lung epithelial cells, likely via dysregulation of the PI3K-Akt pathway. Integrating these marker taxa with clinical factors, such as gestational age and delivery type, offers the potential for precise diagnosis and treatment. The recovery of diverse species can support the construction of a biobank of neonatal respiratory microbiota to advance mechanistic studies and therapeutic strategies.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12721867/pdf/
Citations: 0