
Computational and Structural Biotechnology Journal: Latest Articles

CirRFKB: A knowledgebase of circadian-related risk factors for cancer pathogenesis and personalized medicine.
IF 4.1 | Zone 2 (Biology) | Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-11-22 | eCollection Date: 2025-01-01 | DOI: 10.1016/j.csbj.2025.11.051
Jiao Wang, Hui Zong, Yingbo Zhang, Xingyun Liu, Ke Shen, Xiaoyu Li, Rongrong Wu, Min Jiang, Daniel Rivero Cebrián, Juan Ramón Rabuñal Dopico, Bairong Shen

Circadian rhythms regulate numerous physiological and biochemical processes in humans, and their disruption is linked to elevated cancer risk and progression. Although substantial research has elucidated interactions between circadian mechanisms and cancer pathways, these findings remain fragmented and poorly integrated, impeding a holistic understanding. To address this gap, we developed the Circadian-Related Risk Factor Knowledgebase for Cancer (CirRFKB), a manually curated repository documenting validated associations between the circadian clock and cancer. CirRFKB curates data from 471 articles, encompassing 46 cancer types and 4052 records, and categorizes risk factors into 1449 single factors and 340 combinations. Single factors were categorized into 681 genetic factors, 106 environmental factors, 244 physiological factors, and 418 behavioral factors. Together, the single and combined factors (1789 in total) were further classified as 254 protective factors, 323 risk factors, 291 non-influencing factors, and 921 unclear factors. The user-friendly interface enables researchers to explore, visualize, and retrieve data through comprehensive browsing and query tools. CirRFKB provides a foundational resource that structures circadian-cancer interactions, offering systematic evidence to advance clinical applications in deep phenotyping for precision oncology and the optimization of chronotherapy. CirRFKB is publicly accessible at: http://bioinf.org.cn:9876/.
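
The category counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the published numbers; the dictionary keys are labels chosen here for illustration. Reading the second classification as covering single factors plus combinations is an inference from the totals, not something the abstract states outright.

```python
# Counts as quoted in the abstract; key names are illustrative labels.
single_by_type = {
    "genetic": 681,
    "environmental": 106,
    "physiological": 244,
    "behavioral": 418,
}
by_effect = {
    "protective": 254,
    "risk": 323,
    "no_influence": 291,
    "unclear": 921,
}
single_total = 1449       # single factors reported
combination_total = 340   # factor combinations reported

type_sum = sum(single_by_type.values())
effect_sum = sum(by_effect.values())

# The four factor types add up to the reported number of single factors.
print(type_sum == single_total)
# The four effect classes add up to single factors plus combinations.
print(effect_sum == single_total + combination_total)
```

Both checks pass, which suggests the effect classification was applied to single factors and combinations together.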

Computational and Structural Biotechnology Journal, vol. 27, pp. 5326-5334. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12686628/pdf/
Citations: 0
A formal approach to the hierarchical structures of microbial communities with negative interactions.
IF 4.1 | Zone 2 (Biology) | Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-11-21 | eCollection Date: 2025-01-01 | DOI: 10.1016/j.csbj.2025.11.036
Beatrice Ruth, Bashar Ibrahim, Peter Dittrich

Microbial communities typically consist of numerous species that coexist through intricate mutual dependencies. Understanding the structure of these communities and the interactions among their species is essential for explaining their functions and predicting their behavior. In this study, we follow the idea that a community organizes itself into a hierarchy of potentially persistent sub-communities. Previously, this hierarchy was described using Chemical Organization Theory (COT). However, that approach did not account for negative interactions. Here, we extend the theory by incorporating negative interactions through an inhibitory resource called a toxin. For simplicity, we assume that a taxon sensitive to a toxin cannot coexist with a taxon that produces that toxin. Our results demonstrate that introducing a toxin reduces the number of organizations, with the extent of this reduction depending on various modeling parameters. Further, we show that when essential resources are used, transforming the model into direct taxon-taxon interactions becomes an NP-hard problem. Additionally, we demonstrate that the number of measurements required to infer all persistent subspaces increases. We determine which groups of species are mutually excluded due to toxin interactions. Besides toxic interactions, it is also possible to infer cross-feeding aspects of the microbial community, for which a potential algorithm is outlined and illustrated by an example.
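
The coexistence assumption stated in this abstract (a toxin-sensitive taxon cannot coexist with a producer of that toxin) can be illustrated with a small filter over candidate sub-communities. This is a minimal sketch of that single rule only. It does not compute COT organizations (closure and self-maintenance are not checked), and the taxa, toxin names, and function names are hypothetical.

```python
from itertools import combinations

def toxin_compatible(community, produces, sensitive_to):
    """Return True if no member is sensitive to a toxin produced by any member."""
    toxins_present = set().union(*(produces.get(t, set()) for t in community))
    return all(not (sensitive_to.get(t, set()) & toxins_present) for t in community)

def compatible_subcommunities(taxa, produces, sensitive_to):
    """Enumerate every subset of taxa that passes the toxin rule (exponential in len(taxa))."""
    result = []
    for size in range(len(taxa) + 1):
        for subset in combinations(taxa, size):
            if toxin_compatible(subset, produces, sensitive_to):
                result.append(frozenset(subset))
    return result

# Hypothetical toy community: A produces tox1, B is sensitive to tox1, C is neutral.
taxa = ["A", "B", "C"]
produces = {"A": {"tox1"}}
sensitive_to = {"B": {"tox1"}}

compatible = compatible_subcommunities(taxa, produces, sensitive_to)
print(len(compatible), "of", 2 ** len(taxa), "subsets remain")
```

In this toy case the two subsets containing both A and B are removed, which mirrors the reported effect that adding a toxin reduces the number of admissible sub-communities.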

Computational and Structural Biotechnology Journal, vol. 27, pp. 5561-5574. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12731272/pdf/
Citations: 0
Challenges in predicting protein-protein interactions of understudied viruses: Arenavirus-human interactions.
IF 4.1 | Zone 2 (Biology) | Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-11-21 | eCollection Date: 2025-01-01 | DOI: 10.1016/j.csbj.2025.11.037
Harshita Sahni, Sarah Michelle Crotzer, Juston Moore, Steven S Branda, Trilce Estrada, S Gnanakaran

Understanding protein-protein interactions (PPIs) between viruses and host organisms is crucial for uncovering infection mechanisms and identifying potential therapeutic targets. Generalizing PPI predictive models to understudied viruses presents a significant challenge. In this work, we use arenavirus-human PPIs to illustrate the difficulties associated with model generalization, which are compounded by a lack of both positive and negative data. We employ a transfer learning approach to investigate arenavirus-human PPIs by utilizing models trained on better-studied virus-human and human-human PPIs. Additionally, we curate and assess four types of negative sampling datasets to evaluate their impact on model performance. Although the overall accuracies (93-99%) and AUPRC scores (0.8-0.9) appear promising, further analysis indicates that these performance metrics can be misleading because of data leakage, data bias, and overfitting, especially for under-represented viral proteins. We reveal these gaps and assess the impact of data imbalance using standard k-fold cross-validation and independent blind testing with a balanced dataset, which results in accuracy dropping below 50%. We propose a viral protein-specific evaluation framework that categorizes viral proteins into majority and minority classes based on their representation in the dataset, enabling comparison of model performance across these groups using balanced accuracies. This framework offers a more robust evaluation of model generalizability, addressing biases inherent in standard evaluation techniques and paving the way for more reliable PPI prediction models for understudied viruses.
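
The evaluation idea described in this abstract, comparing balanced accuracy between well-represented and under-represented viral proteins, can be sketched as follows. This is an illustrative sketch and not the authors' code: the `threshold` used to split majority from minority proteins, the record layout, and the toy predictions are all assumptions.

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on positives and the recall on negatives."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    recalls = []
    if pos:
        recalls.append(tp / pos)
    if neg:
        recalls.append(tn / neg)
    return sum(recalls) / len(recalls)

def evaluate_by_group(records, threshold):
    """Split records by how often their viral protein occurs, then score each group."""
    counts = Counter(protein for protein, _, _ in records)
    groups = {"majority": ([], []), "minority": ([], [])}
    for protein, y_true, y_pred in records:
        name = "majority" if counts[protein] >= threshold else "minority"
        groups[name][0].append(y_true)
        groups[name][1].append(y_pred)
    return {name: balanced_accuracy(t, p) for name, (t, p) in groups.items() if t}

# Hypothetical toy predictions: (viral protein, true label, predicted label).
records = [
    ("NP", 1, 1), ("NP", 1, 1), ("NP", 1, 1),
    ("NP", 0, 0), ("NP", 0, 0), ("NP", 0, 0),
    ("Z", 1, 0), ("Z", 0, 0),
]
overall_accuracy = sum(1 for _, t, p in records if t == p) / len(records)
per_group = evaluate_by_group(records, threshold=4)
print(overall_accuracy, per_group)
```

In the toy data the overall accuracy looks high while the minority protein scores no better than chance, which is the kind of gap a per-group balanced accuracy is meant to expose.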

Computational and Structural Biotechnology Journal, vol. 27, pp. 5401-5412. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12703866/pdf/
Citations: 0
Self-supervised domain adaptation of protein language model based solely on positive enzyme-reaction pairs.
IF 4.1 | Zone 2 (Biology) | Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-11-21 | eCollection Date: 2025-01-01 | DOI: 10.1016/j.csbj.2025.11.045
Tomoya Okuno, Naoaki Ono, Md Altaf-Ul-Amin, Shigehiko Kanaya

There is growing interest in developing predictive models of enzyme catalytic properties that leverage activity data spanning diverse enzyme families. A fundamental challenge lies in the inherent biases of public biochemical databases. These databases predominantly catalog valid enzyme activities, rarely include negative instances, and report quantitative catalytic parameters for only a relatively small subset of enzymes. Such limitations pose a major obstacle to supervised learning of enzyme catalytic properties. One existing approach for model training involves generating synthetic negative enzyme-activity pairs by recombining existing enzymes with activity information (particularly substrates or chemical reactions) that was not originally associated with them in the datasets. However, it remains unclear whether the generated negative examples are truly inactive or merely unobserved active instances. To build a model that captures functional properties across diverse enzyme families while avoiding reliance on negative examples, this paper introduces a self-supervised domain adaptation methodology for pre-trained protein language models, based solely on positive enzyme-reaction pairs. The enzyme representations obtained from the adapted protein language model achieved superior or at least competitive performance compared to those from an existing method that relies on synthetic negatives, in both the turnover number prediction task for natural reactions of wild-type enzymes and the activity prediction task for family-wide enzyme-substrate specificity screening datasets. Overall, our approach represents a methodological advancement that eliminates the need for synthetic negatives and provides a scalable framework for leveraging the growing enzyme activity data in biochemical databases.
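
The synthetic-negative strategy that this abstract argues against can be sketched in a few lines, which also makes its weakness visible. This is a hypothetical illustration of the general recombination idea, not the specific method the paper compares with. The enzyme and reaction identifiers and the function name are made up.

```python
import random

def synthetic_negatives(positive_pairs, n, seed=0):
    """Pair enzymes with reactions they were never recorded with and label them negative.

    Caveat: an unrecorded pair is not proven inactive; it may be an unobserved positive.
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    enzymes = sorted({enzyme for enzyme, _ in positives})
    reactions = sorted({reaction for _, reaction in positives})
    candidates = [
        (enzyme, reaction)
        for enzyme in enzymes
        for reaction in reactions
        if (enzyme, reaction) not in positives
    ]
    return rng.sample(candidates, min(n, len(candidates)))

# Hypothetical positive enzyme-reaction pairs.
positives = [("E1", "R1"), ("E2", "R2"), ("E3", "R3")]
negatives = synthetic_negatives(positives, n=4)
print(negatives)
```

Nothing in this procedure checks whether a sampled pair is truly inactive, which is the ambiguity that motivates training on positive pairs alone.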

Computational and Structural Biotechnology Journal, vol. 27, pp. 5441-5449. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12712682/pdf/
Citations: 0
PanARGMiner (Pan-Genomic Antimicrobial Resistance Gene Miner): An advanced feature selection framework for extracting key resistance genes from pan-genomic datasets.
IF 4.1 | Zone 2 (Biology) | Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-11-21 | eCollection Date: 2025-01-01 | DOI: 10.1016/j.csbj.2025.11.046
Yu-Cheng Chen, Ming-Ren Yang, Yu-Wei Wu

Identifying antimicrobial resistance (AMR)-related biomarkers from large-scale genomic datasets is often akin to finding a needle in a haystack. With pan-genomic data containing more than 100,000 gene sequences, isolating features that truly drive resistance remains a major challenge in computational biology. Here we present PanARGMiner, a machine learning-based feature selection framework designed to robustly extract highly relevant and informative biomarkers from high-dimensional biological data. PanARGMiner uses an ensemble-based feature selection strategy to select highly informative and compact feature subsets. It then uses repeated iterations to ensure that the selection is stable and reliable, enabling PanARGMiner to produce substantially smaller feature sets whose prediction performance is comparable to that obtained with other feature selection algorithms. Applied to bacterial pan-genomic AMR datasets, PanARGMiner extracted as few as one to ten candidate AMR biomarkers from datasets with more than 100,000 genes for five common pathogens. Although many of the extracted candidate AMR biomarkers are well-known resistance genes, proteins not known to be associated with AMR mechanisms, including functionally uncharacterized hypothetical proteins, were also extracted. This indicates the potential of PanARGMiner in revealing both established and novel mechanisms of antibiotic resistance, thus providing actionable insights for biomarker discovery, functional genomics, and precision medicine based on complex data. Its ability to uncover both known and uncharacterized resistance-related features offers new opportunities for research and clinical applications in combating AMR.
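
The two ingredients named in this abstract, an ensemble of feature selectors and repeated iterations for stability, can be illustrated with a generic sketch. This is not PanARGMiner's algorithm, which the abstract does not specify. The two scoring functions, bootstrap resampling, rank aggregation, and the `min_fraction` cutoff are all assumptions chosen to keep the example small.

```python
import random

def freq_difference(column, y):
    """Absolute difference in gene presence frequency between the two classes."""
    pos = [v for v, label in zip(column, y) if label == 1]
    neg = [v for v, label in zip(column, y) if label == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def agreement(column, y):
    """How well presence/absence matches the label, in either direction."""
    match = sum(1 for v, label in zip(column, y) if v == label) / len(y)
    return max(match, 1 - match)

def ensemble_top_k(X, y, k):
    """Rank features under each scorer, sum the ranks, and keep the k best."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    total_rank = [0] * n_features
    for scorer in (freq_difference, agreement):
        scores = [scorer(col, y) for col in columns]
        order = sorted(range(n_features), key=lambda j: -scores[j])
        for rank, j in enumerate(order):
            total_rank[j] += rank
    return sorted(range(n_features), key=lambda j: total_rank[j])[:k]

def stable_features(X, y, k=1, iterations=50, min_fraction=0.8, seed=0):
    """Repeat the ensemble on bootstrap samples; keep features selected often enough."""
    rng = random.Random(seed)
    hits = [0] * len(X[0])
    done = 0
    while done < iterations:
        idx = [rng.randrange(len(X)) for _ in X]
        ys = [y[i] for i in idx]
        if len(set(ys)) < 2:  # a bootstrap sample needs both classes
            continue
        for j in ensemble_top_k([X[i] for i in idx], ys, k):
            hits[j] += 1
        done += 1
    return [j for j, h in enumerate(hits) if h / iterations >= min_fraction]

# Hypothetical presence/absence matrix: rows are isolates, columns are genes.
# Gene 0 tracks the resistance label exactly; the others are uninformative.
y = [1, 1, 1, 1, 0, 0, 0, 0]
X = [
    [1, 1, 1, 0], [1, 0, 1, 1], [1, 1, 1, 1], [1, 0, 1, 0],
    [0, 1, 1, 0], [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 0],
]
selected = stable_features(X, y)
print(selected)
```

Counting how often a feature survives across resampled runs is one common way to favor stable selections over ones that depend on a particular sample.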

Computational and Structural Biotechnology Journal, vol. 27, pp. 5363-5374. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12699266/pdf/
Citations: 0
Low-calorie diet intervention ameliorates gut microbiota dysbiosis and metabolic changes in obese patients with type 2 diabetes under standard care.
IF 4.1 | Zone 2 (Biology) | Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-11-20 | eCollection Date: 2025-01-01 | DOI: 10.1016/j.csbj.2025.11.043
Mongkontida Umphonsathien, Pornsawan Prutanopajai, Thanya Cheibchalard, Naraporn Somboonna

Background: Dietary interventions can modulate the gut bacterial community (microbiota) and offer a complementary strategy for improving metabolic control in type 2 diabetes (T2D). This pilot study evaluated clinical outcomes and gut microbiota changes following a structured low-calorie diet (LCD) intervention in obese T2D individuals under standard care.

Methods: Twenty obese T2D patients were randomized into an intervention group (n = 15), which followed a 6-week LCD of 1000-1200 kcal/day designed for glycemic and metabolic control, or a matched control group (n = 5). Clinical parameters and fecal microbiota profiles were assessed at baseline, week 6, and week 12.

Results: The intervention group showed clinical trends toward improved glycemic and metabolic parameters, including reductions in fasting plasma glucose (FPG), hemoglobin A1c (HbA1c), and lipid levels (i.e., cholesterol) (P > 0.05), accompanied by significant loss of body weight, body mass index (BMI), and body fat (P < 0.05). Four intervention participants (26.7%) achieved normoglycemia without glucose-lowering medication. Gut microbiota analyses revealed significant alterations in alpha and beta diversity over time in the intervention group (AMOVA: P(control baseline, intervention 12-week) = 0.025 and P(intervention baseline, intervention 12-week) = 0.002), with increased abundance of beneficial genera, namely Streptococcus, Bifidobacterium, and Lactobacillus, and enrichment of Actinobacteria, Candidatus Saccharibacteria (TM7), and Firmicutes at week 12. Linear discriminant analysis effect size (LEfSe) analysis identified distinct microbial biomarkers that differentiated the groups. Microbial functional predictions revealed significantly decreased inferred activity in pathways related to adipocytokine signaling, D-glutamine and D-glutamate metabolism, and type I diabetes mellitus (P < 0.05); however, these predictions were computational inferences and not experimentally validated.

Conclusion: A structured LCD combined with standard care led to metabolic improvement and a trend toward gut microbiota remodeling in obese Thai individuals with T2D. The findings support the use of dietary interventions to beneficially modulate the gut microbiome and metabolic health, while highlighting the need for larger studies and functional validation.
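
For readers unfamiliar with the alpha and beta diversity measures mentioned in the Results, the sketch below computes one common example of each. The abstract does not say which indices the study used, so the Shannon index and Bray-Curtis dissimilarity are assumptions here, and the taxon counts are invented toy values, not study data.

```python
import math

def shannon_index(counts):
    """Alpha diversity: Shannon entropy (natural log) of relative abundances."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Beta diversity: Bray-Curtis dissimilarity between two count vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

# Hypothetical counts for four taxa in one participant at two time points.
baseline = [50, 50, 0, 0]
week12 = [25, 25, 25, 25]

alpha_baseline = shannon_index(baseline)
alpha_week12 = shannon_index(week12)
beta = bray_curtis(baseline, week12)
print(alpha_baseline, alpha_week12, beta)
```

Alpha diversity describes richness and evenness within one sample, while beta diversity measures how different two samples are. A test such as AMOVA then asks whether between-group distances exceed within-group distances.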

Computational and Structural Biotechnology Journal, vol. 27, pp. 5307-5317. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12686632/pdf/
Citations: 0
ExoOrb: A novel visual and analytical system for therapeutic extracellular vesicles metrics.
IF 4.1 Zone 2 (Biology) Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-11-19 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.11.038
Touseef Ur Rehman, Muhammad Rameez Ur Rahman, Weihua Tang, Sebastiano Vascon, Pei Jiang, Yu Liu, Senyi Gong, Xun Wan, Ali Mohsin, Meijin Guo

Extracellular vesicles (EVs) are naturally secreted nanoscale mediators of intercellular communication with potential for therapeutic and functional food applications. Although many EVs are isolated with claims of therapeutic benefit, evaluating them requires extensive resources and time and often yields futile outcomes. This work addresses the gap by developing a visual and quantitative system, using monk fruit cell-derived EVs (MFEVs) as a model, that efficiently selects the most suitable therapeutic EVs by analyzing their characterization parameters, saving resources and time. To generate variation, MFEVs were isolated using eight techniques: ultracentrifugation, ultrafiltration, polyethylene glycol (PEG) precipitation (8%, 10%, 15%, and 20%), anion-exchange chromatography, and a novel combined ultrafiltration-precipitation method. Following isolation, their physicochemical properties, biochemical composition, and bioactivity were characterized, and their dose-dependent anticancer effects were evaluated across multiple cancer cell lines. Next, "ExoOrb" was developed using the correlations between anticancer activity and the characterization parameters. It is an analytical multicriteria decision-making system that objectively ranks the therapeutic potential of EVs by employing factor normalization, weighted scoring, and multidimensional visualizations. The system was validated on both the original dataset and synthetic datasets. The original dataset identified PEG 10%-MFEVs as more effective therapeutically, and the synthetic dataset confirmed ExoOrb's ability to quantify metrics for EVs across multiple EV types. To our knowledge, ExoOrb is the first potentially universal framework for evaluating the therapeutic potential of EVs based on characterization parameters, providing a reliable tool for scientific and therapeutic research through standardized, data-driven optimization.
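
The abstract names factor normalization and weighted scoring as the core of the ranking step but gives no formulas. The sketch below shows one common way such a multicriteria ranking can be built: min-max normalization followed by a weighted sum. It is not ExoOrb's actual implementation. The function names, the criteria, the weights, the "lower size is better" direction, and every number are hypothetical and chosen only for illustration.

```python
def min_max_normalize(values, higher_is_better=True):
    """Scale values to [0, 1]; invert when a lower raw value is preferable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates tie on this criterion, so it cannot separate them.
        return [0.5 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_candidates(table, weights, higher_is_better):
    """Return (name, score) pairs sorted from best to worst weighted score."""
    names = list(table)
    total_weight = sum(weights.values())
    scores = {name: 0.0 for name in names}
    for criterion, weight in weights.items():
        column = [table[name][criterion] for name in names]
        normalized = min_max_normalize(column, higher_is_better[criterion])
        for name, value in zip(names, normalized):
            # Weights are rescaled to sum to 1 so scores stay in [0, 1].
            scores[name] += (weight / total_weight) * value
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements, not data from the paper.
example = {
    "prep_A": {"yield": 10.0, "size_nm": 150.0, "activity": 0.2},
    "prep_B": {"yield": 20.0, "size_nm": 100.0, "activity": 0.8},
    "prep_C": {"yield": 15.0, "size_nm": 200.0, "activity": 0.5},
}
weights = {"yield": 1.0, "size_nm": 1.0, "activity": 2.0}
direction = {"yield": True, "size_nm": False, "activity": True}

ranking = rank_candidates(example, weights, direction)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Min-max scaling keeps every criterion on the same [0, 1] range, so the weights alone decide how much each one contributes. A real system would also need a policy for missing measurements and for outliers, which this sketch ignores.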

Volume 27, pages 5289-5306. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12681852/pdf/
Citations: 0
MSF-CPMP: A novel multi-source feature fusion model for prediction of cyclic peptide membrane permeability.
IF 4.1 Zone 2 (Biology) Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-11-19 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.11.041
Yijun Zhang, Zimeng Chen, Zhuxuan Wan, Qianhui Jiang, Xiaoling Lu, Bin Yan, Jing Qin, Yong Liu, Junwen Wang

Cyclic peptides are becoming attractive molecules for drug discovery because of their inherent stability and structural diversity. However, the potential of cyclic peptide drugs is limited by their poor permeability across the cell membrane. To predict cyclic peptide membrane permeability (CPMP), a growing number of computational models and tools have been designed and used, but these existing algorithms do not adequately capture the feature diversity of cyclic peptides. In this study, we introduce a novel multi-source feature fusion model, MSF-CPMP, which aims to increase the accuracy of CPMP prediction. The MSF-CPMP model incorporates three types of features, extracted from SMILES sequences, graph-based molecular structures, and physicochemical properties of cyclic peptides. In benchmarks against other non-deep-learning and deep-learning-based methods, MSF-CPMP achieved the highest evaluation metrics, including an accuracy of 0.9062 and an AUROC of 0.9546, and the benchmarks further validated the robustness of its learning capabilities and the efficacy of its multi-source fusion. Our results demonstrate that MSF-CPMP outperforms other methods in predicting CPMP and exemplify the power of advanced deep learning methods in tackling complex biological challenges, offering contributions to computational biology and clinical treatment. Code is available at https://github.com/wanglabhku/MSF-CPMP.
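
For readers unfamiliar with the two metrics quoted above, the following sketch computes accuracy and AUROC from scratch for a binary permeability label. It does not reproduce the authors' evaluation code. The 0.5 decision threshold, the labels, and the scores are hypothetical. AUROC is computed here as the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = permeable) and model scores, for illustration only.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy:", round(accuracy(labels, predicted), 4))
print("AUROC:", round(auroc(labels, scores), 4))
```

Accuracy depends on the chosen threshold, while AUROC depends only on how the scores order the examples, which is why papers often report both. The pairwise loop is quadratic in the number of examples; for large test sets a rank-based formulation is the usual choice.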

Volume 27, pages 5413-5424. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12703865/pdf/
Citations: 0
LC-MS/MS identifies elevated imidazole propionate and gut-derived metabolite alterations in peritoneal dialysis patients.
IF 4.1 Zone 2 (Biology) Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-11-17 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.11.039
Weerawan Manokasemsan, Narumol Jariyasopit, Kwanjeera Wanichthanarak, Patcha Poungsombat, Alongkorn Kurilung, Suphitcha Limjiasahapong, Kajol Thapa, Yongyut Sirivatanauksorn, Sukit Raksasuk, Thatsaphan Srithongkul, Chagriya Kitiyakara, Sakda Khoomrung

We developed a single robust LC-MS/MS method for the simultaneous quantification of 16 uremic toxins (UTs) and 14 bile acids (BAs) in plasma and fecal samples. The method demonstrated high sensitivity, broad metabolite coverage, and excellent accuracy, precision, and throughput. Using this platform, targeted metabolites were quantified in peritoneal dialysis (PD) patients (n = 31) and healthy controls (HC; n = 60). Of the 30 targeted metabolites covered by the validated method, 20 were detected in fecal samples and 12 in plasma in this study. Fecal samples exhibited greater BA diversity, whereas UTs were evenly distributed across both matrices. Fecal profiles showed minimal differences between PD and HC, suggesting limited gut-level alteration. In contrast, plasma analysis revealed nine metabolites significantly elevated in PD: indoxyl sulfate, phenyl sulfate, hippuric acid, imidazole propionate (ImP), lithocholic acid, cinnamoylglycine, m-hydroxyhippuric acid, phenylacetylglutamine, and phenylacetylglycine. Notably, plasma ImP, an underexplored metabolite, was elevated independently of diabetes or cardiovascular disease, implicating impaired renal clearance as its primary driver. These results highlight the systemic impact of gut-derived metabolites in kidney failure and position targeted UT-BA profiling as a powerful complementary tool for clinical metabolomics in chronic kidney disease and PD.

Volume 27, pages 5271-5280. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12681520/pdf/
Citations: 0
Slimformer: An NLP-based web server for semantic categorization of gene sets.
IF 4.1 Zone 2 (Biology) Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-11-16 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.11.035
Fionn Daire Keogh, Jonas Marx, Alicia Hiemisch, Rainer Koenig

Omics data analysis often yields extensive lists of genes or enriched gene sets, making it difficult to interpret the underlying cellular mechanisms. Existing gene set categorization methods typically rely on the Gene Ontology hierarchy, neglecting semantic similarity encoded in textual descriptions. We developed Slimformer, an embedding-based Natural Language Processing model that learns contextual relationships between gene sets based on their names, descriptions, and associated genes. A supervised classifier, trained on a manually curated gold standard, then assigns these embeddings to process categories. Applied to 2856 annotated gene sets, Slimformer achieved 82.4% balanced accuracy and an F1-score of 0.867. Applied to gene expression data from human cells infected with Respiratory Syncytial Virus, Slimformer revealed strong downregulation of major cell cycle processes, a finding that is highly relevant to the viral pathomechanism and that was overlooked by the other tools we tested. By integrating linguistic and functional information, Slimformer enhances the interpretability of omics data and provides a flexible framework for systematic gene set categorization.
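
Balanced accuracy, reported above, is the mean of the per-class recalls, which makes it less sensitive to class imbalance than plain accuracy. The sketch below computes it together with a macro-averaged F1 for a multi-class categorization task. The abstract does not state how the F1-score was averaged, so macro averaging is an assumption here, and the category names and labels are hypothetical.

```python
def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    classes = sorted(set(y_true) | set(y_pred))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in classes}
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            counts[truth]["tp"] += 1
        else:
            counts[truth]["fn"] += 1  # the true class was missed
            counts[guess]["fp"] += 1  # the guessed class was wrongly assigned
    return counts


def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes that occur in the true labels."""
    counts = per_class_counts(y_true, y_pred)
    recalls = [
        c["tp"] / (c["tp"] + c["fn"])
        for c in counts.values()
        if c["tp"] + c["fn"] > 0
    ]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    counts = per_class_counts(y_true, y_pred)
    f1_values = []
    for c in counts.values():
        denominator = 2 * c["tp"] + c["fp"] + c["fn"]
        f1_values.append(2 * c["tp"] / denominator if denominator else 0.0)
    return sum(f1_values) / len(f1_values)


# Hypothetical gold-standard categories and predictions, for illustration only.
truth = ["cell cycle"] * 4 + ["immune"] * 2 + ["metabolism"] * 2
guess = ["cell cycle", "cell cycle", "cell cycle", "immune",
         "immune", "immune", "metabolism", "cell cycle"]

print("balanced accuracy:", round(balanced_accuracy(truth, guess), 4))
print("macro F1:", round(macro_f1(truth, guess), 4))
```

Both metrics give each category equal weight regardless of how many gene sets it contains, which matters when a few process categories dominate the gold standard.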

Volume 27, pages 5252-5262. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12671349/pdf/
Citations: 0