
Computational Biology and Chemistry: Latest Articles

Predicting antimicrobial resistance in Staphylococcus aureus using machine learning: Insights from a five-year surveillance study
IF 3.1, Zone 4 (Biology), Q2 BIOLOGY. Pub Date: 2026-06-01. Epub Date: 2026-01-29. DOI: 10.1016/j.compbiolchem.2026.108932
Mohammed F. Aldawsari, Hisham N. Altayb, Ehssan Moglad
Staphylococcus aureus is a leading cause of both community- and hospital-acquired infections, and the growing prevalence of antimicrobial resistance complicates clinical management worldwide. This study investigated the epidemiology, resistance trends, multidrug resistance (MDR) patterns, and the role of machine learning (ML) in predicting antibiotic susceptibility in Saudi Arabia. A total of 18,003 microbiology reports (2019–2024) were analyzed, identifying 2506 S. aureus isolates. Susceptibility testing included 31 antibiotics representing 11 pharmacological classes. Predictive ML models (Random Forest, Logistic Regression, Gradient Boosting) were trained and evaluated using accuracy, precision, recall, F1-score, and confusion matrices. Wound (24 %) and blood (23 %) were the most frequent sources of S. aureus. High resistance (>70 %) was observed for β-lactams, fluoroquinolones, and macrolides/lincosamides, while glycopeptides, oxazolidinones, and lipopeptides maintained excellent activity (<10 % resistance). MDR occurred in 30 % of isolates, XDR in 0.6 %, and no PDR isolates were detected. Among ML models, Random Forest achieved the best overall performance across most antibiotics, Logistic Regression was optimal for ampicillin, and Gradient Boosting for linezolid. Vancomycin, linezolid, penicillin, and SXT achieved precision and recall above 0.92, demonstrating strong predictive reliability. S. aureus remains a major clinical threat in Saudi Arabia, with high MDR rates but preserved efficacy of last-line antibiotics. This study highlights the value of combining multi-center surveillance with interpretable machine learning approaches to support antimicrobial stewardship, enhance early resistance prediction, and inform data-driven clinical decision-making, particularly in settings where rapid molecular diagnostics may be limited.
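The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score) all derive from the confusion matrix. The following is a minimal sketch of that arithmetic for a single antibiotic, assuming a binary resistant/susceptible label; the function name, the "R"/"S" labels, and the toy data are illustrative assumptions and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive="R"):
    """Compute accuracy, precision, recall and F1 for one antibiotic.

    y_true and y_pred are equal-length lists of labels; `positive` is the
    label treated as the positive class (assumed here: "R" for resistant).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix cells.
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    # Guard against empty denominators by returning 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels invented for illustration only (not data from the study).
truth = ["R", "R", "S", "S", "R", "S"]
preds = ["R", "S", "S", "S", "R", "R"]
m = binary_metrics(truth, preds)
print(m)
```

Reporting precision and recall together, as the abstract does for vancomycin, linezolid, penicillin, and SXT, matters because accuracy alone can look high when one class dominates.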
Computational Biology and Chemistry, Volume 122, Article 108932
Citations: 0
Pivot gene enrichment analysis of Streptococcus pyogenes specific hyaluronic acid mediated disease prognosis on gastric cancer: Based on bioinformatics study
IF 3.1, Zone 4 (Biology), Q2 BIOLOGY. Pub Date: 2026-06-01. Epub Date: 2026-02-06. DOI: 10.1016/j.compbiolchem.2026.108928
Debaleena Samanta, Malavika Bhattacharya

Background

The gut ecosystem is maintained by immune regulation through the intestinal microbiota, which can lead to inflammatory diseases such as gastric cancer. Hyaluronic acid is derived from the gut microorganism Streptococcus pyogenes, which directly controls the up- and down-regulation of potential gene sets that help to promote or inhibit gastric cancer.

Methods

The GEO database is used to identify potential hub genes related to hyaluronic acid-mediated gastric cancer. Gene expression analysis and PPI network analysis are carried out through the EMBL-EBI and STRING databases, respectively, under DAVID software. Gene interactions are studied with the Reactome data source, and gene networks are identified through the GeneMANIA online server. BIOVENN is used to produce the Venn diagram, and GSEA is used to generate the heat map. Identification of microbial signal transduction through the MiST website and analysis of regulons and transcription factors through the RegPrecise and MetaCyc web sources are incorporated for biosynthetic pathway analysis. TCGA is used to study cancer genomics and gene interaction pathways. KEGG pathway enrichment is done through the ShinyGO resource. KM survival plots are produced with CybersortX. Genome expression analysis is done with the GEPIA web portal. Resistome and variant isolation and by-products of Streptococcus pyogenes MGAS are examined through the CARD and BV-BRC databases. Ligand-drug analysis and TCGA drug response and survival analysis are carried out through the MCULE and GEPIA 3 web sources.

Results

Differential expression analysis identified up-regulated and down-regulated genes related to the HMMR gene. Venn analysis identified 3 co-expressed genes among the HMMR, IL1B and HAS3 genes. The global cancer heat map of the HMMR gene showed expression intensity values ranging from a high of 0.50204 to a low of −0.58367. The cellular response related to the HMMR gene is responsible for programmed cell death due to inactivation of the Cyclin B (Cdk1) complex mediated by Chk1/Chk2 (Cds1). Streptococcus pyogenes mediated biological pathways, transcription factors, regulons and genomic analysis of the HMMR protein are also identified. KEGG enrichment analysis shows the NF-kB signaling pathway in the hyaluronic acid-mediated network gene set. KM survival analysis is reported through the hazard ratio (HR) and p-value. Drug-target docking analysis of the ligand molecule hyaluronic acid with the drugs 5-fluorouracil and epirubicin, together with TCGA drug survival and response analysis, is relevant to therapeutic interventions.
Computational Biology and Chemistry, Volume 122, Article 108928
Citations: 0
Computational screening of natural plant and marine compounds as potential inhibitors of Mycobacterium tuberculosis dihydrodipicolinate synthase
IF 3.1, Zone 4 (Biology), Q2 BIOLOGY. Pub Date: 2026-06-01. Epub Date: 2026-01-31. DOI: 10.1016/j.compbiolchem.2026.108936
Swati Meena, Firdaus Fatima, Faizan Abul Qais, Srinivasan Ramachandran
The escalating threat of antimicrobial resistance (AMR) in Mycobacterium tuberculosis highlights the urgent need for innovative therapeutic strategies based on the search for novel inhibitor molecules. Plant and marine habitats are rich reservoirs of bioactive compounds. Targeting pathways critical for M. tuberculosis survival, such as cell wall biosynthesis, offers a promising approach for drug discovery. Diaminopimelate, a critical component of the bacterial cell wall, is synthesized through the lysine biosynthesis pathway. Dihydrodipicolinate synthase (Mtb-DapA) is a promising drug target in this pathway. We screened antitubercular natural and marine-derived compounds against Mtb-DapA using molecular docking and molecular dynamics (MD) simulations, and performed Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) analysis to estimate binding free energies and identify promising inhibitors. In this study, we analysed 633 phytochemicals from the BioPhytMol Database, 210 anti-TB phytochemicals from recent literature, and 406 marine habitat-derived anti-TB compounds. Based on MD simulations and MM-PBSA analysis, we report the top three inhibitors: glycyrrhizin, micromeline and lico-isoflavone. Their binding free energies from MM-PBSA analysis were −50.39 kcal/mol, −17.90 kcal/mol and −17.88 kcal/mol, respectively. Glycyrrhizin emerged as the most potent inhibitor. These findings underscore the therapeutic potential of glycyrrhizin, micromeline and lico-isoflavone as promising candidates for further development, offering hope for alternative treatments against M. tuberculosis.
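For readers unfamiliar with MM-PBSA, binding free energies of the kind quoted in this abstract are usually obtained from the standard end-point decomposition sketched below. This is the generic textbook form and not necessarily the exact protocol of the study; the abstract does not say, for example, whether the entropy term was included.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= G_{\mathrm{complex}} - \left( G_{\mathrm{receptor}} + G_{\mathrm{ligand}} \right) \\
G_{x} &= E_{\mathrm{MM}} + G_{\mathrm{solv}} - T S, \qquad x \in \{\mathrm{complex}, \mathrm{receptor}, \mathrm{ligand}\} \\
E_{\mathrm{MM}} &= E_{\mathrm{bonded}} + E_{\mathrm{vdW}} + E_{\mathrm{elec}} \\
G_{\mathrm{solv}} &= G_{\mathrm{PB}} + G_{\mathrm{SA}}
\end{align}
```

Each term is typically averaged over snapshots from the MD trajectory. A more negative value of the binding free energy indicates stronger predicted binding, which is why glycyrrhizin (−50.39 kcal/mol) ranks ahead of micromeline and lico-isoflavone.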
Computational Biology and Chemistry, Volume 122, Article 108936
Citations: 0
Measuring genomic data with prefix-free parsing
IF 3.1, Zone 4 (Biology), Q2 BIOLOGY. Pub Date: 2026-06-01. Epub Date: 2025-12-30. DOI: 10.1016/j.compbiolchem.2025.108870
Simone Lucà, Francesco Masillo, Zsuzsanna Lipták

Summary:

Prefix-free parsing (Boucher et al., 2019) is a highly effective heuristic for computing text indexes for very large amounts of biological data. The algorithm constructs a data structure, the prefix-free parse (PFP) of the input, consisting of a dictionary and a parse, which is then used to speed up computation of the final index. In this paper, we study the size of the PFP, which we refer to as π, and show that it is a powerful tool in its own right. To show this, we present two use cases. We first study the application of π as a repetitiveness measure of the input text, and compare it to other currently used repetitiveness measures, including z (the number of Lempel–Ziv phrases), r (the number of runs of the Burrows–Wheeler Transform), and δ (the text’s substring complexity). We then turn to the use of π as a measure for pangenome openness. In both applications, our results are similar to existing measures, but our tool, in almost all cases, is more efficient than those computing the other measures, both in terms of time and space, sometimes by orders of magnitude. We close the paper with a detailed systematic study of the parameter choice for PFP (window size w and modulus p). This gives rise to interesting open questions.
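To make the dictionary, the parse, and the roles of w and p concrete, here is a simplified sketch of prefix-free parsing. It is not the piPFP implementation: it omits the sentinel characters of the original algorithm, recomputes a plain polynomial hash instead of using a rolling hash, and assumes that the size π is the total length of the dictionary phrases plus the number of phrases in the parse. All function names are hypothetical.

```python
def window_hash(window: str) -> int:
    """Deterministic polynomial hash of one window (recomputed, not rolling)."""
    h = 0
    for ch in window:
        h = (h * 256 + ord(ch)) % 1_000_000_007
    return h


def prefix_free_parse(text: str, w: int, p: int):
    """Return (dictionary, parse) for a simplified prefix-free parsing.

    A length-w window is a trigger when its hash is 0 modulo p. Each phrase
    ends with a trigger window (or at the end of the text), and the next
    phrase starts with that same window, so neighbours overlap by w.
    """
    if w < 1 or p < 1:
        raise ValueError("w and p must be positive")
    phrases = []
    start = 0
    for i in range(len(text) - w + 1):
        # Skip the window at the phrase start so phrases are longer than w.
        if i > start and window_hash(text[i:i + w]) % p == 0:
            phrases.append(text[start:i + w])
            start = i
    phrases.append(text[start:])
    dictionary = sorted(set(phrases))              # distinct phrases
    rank = {ph: k for k, ph in enumerate(dictionary)}
    parse = [rank[ph] for ph in phrases]           # text as phrase ranks
    return dictionary, parse


def pfp_size(text: str, w: int, p: int) -> int:
    """Assumed definition of pi: dictionary characters plus parse length."""
    dictionary, parse = prefix_free_parse(text, w, p)
    return sum(len(ph) for ph in dictionary) + len(parse)


# Degenerate example with p = 1, where every window is a trigger.
print(prefix_free_parse("AAAAAAAA", 2, 1))
print(pfp_size("AAAAAAAA", 2, 1))
```

The intuition for using π as a repetitiveness measure is that repeated regions of the input produce the same phrases, so the dictionary grows slowly on repetitive collections. Larger p makes triggers rarer and phrases longer, which is the trade-off the paper's parameter study examines.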

Availability and implementation:

The source code is available at https://github.com/simolucaa/piPFP. The accession codes for all the datasets used and the raw results are available at https://github.com/simolucaa/piPFP_experiments.
Computational Biology and Chemistry, Volume 122, Article 108870
Citations: 0
A multi-layer hypergraph framework for drug-drug interaction prediction based on transformer and hypergraph convolution
IF 3.1, Zone 4 (Biology), Q2 BIOLOGY. Pub Date: 2026-06-01. Epub Date: 2026-01-08. DOI: 10.1016/j.compbiolchem.2026.108880
Lu Shen, Feng Hu, Libing Bai
Drug-drug interactions represent a key problem for drug research, development, and clinical practice. Accurately predicting the interactions that arise when drugs are combined is crucial for improving treatment safety and optimizing medication regimens. However, prediction performance is limited by the exponential increase in potential drug combinations and by conventional graph and multi-layer network prediction models, which primarily capture only binary relationships between drugs and struggle to represent multi-element synergistic interactions. To overcome these challenges, this paper proposes a Multi-Layer Hypergraph framework for drug interaction prediction using Transformer and Hypergraph Convolution (MLHTHC). The framework first constructs a multi-layer similarity hypergraph of drugs based on four attribute types: chemical structure, ATC code, drug category, and corresponding targets. Using drug-drug interaction data from the KEGG database as a benchmark, the spectral Hamming similarity method is adopted to calculate the structural similarity between each constructed hypergraph and the benchmark hypergraph, which determines the importance weight of each hypergraph layer. Subsequently, a hypergraph convolutional neural network performs network embedding on the drug nodes of each layer; the Transformer model is used to weight and fuse the multi-layer features; and finally, a multi-layer perceptron (MLP) is used to predict drug-drug interactions (DDIs). Experimental results demonstrate that this model outperforms existing methods such as DPSP and DANN, with the integration of Transformer and hypergraph convolution significantly enhancing prediction accuracy. This approach provides an effective tool for drug-drug interaction prediction.
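The abstract does not give the formula of its hypergraph convolution layer. As a rough illustration of how a hyperedge can link more than two drugs at once, the sketch below applies one step of the widely used HGNN-style normalized propagation, Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X, with the learnable weight matrix and the nonlinearity left out. The choice of this propagation rule, the function name, and the toy incidence matrix are assumptions for illustration, not the MLHTHC implementation.

```python
import math


def hypergraph_propagate(H, X, edge_weights=None):
    """One parameter-free HGNN-style propagation step.

    H is an n_nodes x n_edges 0/1 incidence matrix (lists of lists),
    X is an n_nodes x dim feature matrix, edge_weights defaults to all ones.
    """
    n_nodes, n_edges = len(H), len(H[0])
    dim = len(X[0])
    w = edge_weights if edge_weights is not None else [1.0] * n_edges

    # Weighted node degrees and hyperedge degrees (hyperedge sizes).
    d_v = [sum(w[e] * H[i][e] for e in range(n_edges)) for i in range(n_nodes)]
    d_e = [sum(H[i][e] for i in range(n_nodes)) for e in range(n_edges)]

    def inv_sqrt(d):
        return 1.0 / math.sqrt(d) if d > 0 else 0.0

    # Step 1: scale node features by Dv^(-1/2).
    scaled = [[inv_sqrt(d_v[i]) * x for x in X[i]] for i in range(n_nodes)]

    # Step 2: average the scaled features of the nodes in each hyperedge.
    edge_feats = []
    for e in range(n_edges):
        coeff = w[e] / d_e[e] if d_e[e] > 0 else 0.0
        edge_feats.append([
            coeff * sum(H[i][e] * scaled[i][k] for i in range(n_nodes))
            for k in range(dim)
        ])

    # Step 3: send hyperedge features back to member nodes and rescale.
    return [
        [inv_sqrt(d_v[i]) * sum(H[i][e] * edge_feats[e][k] for e in range(n_edges))
         for k in range(dim)]
        for i in range(n_nodes)
    ]


# Hypothetical toy hypergraph: 3 drugs, 2 hyperedges
# (for instance, one shared target and one shared drug category).
H = [[1, 0],
     [1, 1],
     [1, 1]]
X = [[1.0], [0.0], [0.0]]
out = hypergraph_propagate(H, X)
print(out)
```

In the toy example the first hyperedge contains all three drugs, so the feature of drug 0 reaches drugs 1 and 2 in a single step. An ordinary graph would need separate pairwise edges for the same effect, which is the limitation of binary relationships the abstract refers to.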
Computational Biology and Chemistry, Volume 122, Article 108880
Citations: 0
Influence of drying temperature on the metabolites profile and potential antioxidant pathways of Passiflora edulis peel: Integrating untargeted metabolomics with network pharmacology analyses, molecular docking, and molecular dynamics simulation
IF 3.1, Zone 4 (Biology), Q2 BIOLOGY. Pub Date: 2026-06-01. Epub Date: 2026-01-17. DOI: 10.1016/j.compbiolchem.2026.108910
Mohamad Norisham Mohamad Rosdi, Mohd Nurhafizam Karuning, Nur Hanisah Azmi, Mohamad Hafizi Abu Bakar, Yanty Noorziana Abdul Manaf, Feri Eko Hermanto, Aniza Saini, Mohd Azrie Awang, Zainul Amiruddin Zakaria
Passiflora edulis peels possess considerable antioxidative potential, which is attributed to their diverse bioactive components. Nevertheless, these substances are susceptible to thermal degradation, which can diminish their usefulness and result in resource wastage. This research explores the influence of drying under varying temperature conditions (room temperature (∼28 °C), 40 °C, and 70 °C) on the antioxidant properties and metabolite composition of P. edulis peel extracts. A comprehensive analytical approach was adopted, encompassing proximate analysis, vitamin C quantification, total phenolic and flavonoid determinations, free radical scavenging assays, metabolite profiling, network pharmacology, molecular docking, and molecular dynamics simulation. In this study, the contents of crude fibre and of primary metabolites, including fat, protein and carbohydrate, were shown to be affected by increasing drying temperature. Likewise, extract of P. edulis peels dried at room temperature exhibited significant antioxidant activity at 1 mg/mL, inhibiting 2,2-diphenyl-1-picrylhydrazyl radicals (DPPH•) by 81.20 % and 2,2’-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) radicals (ABTS⁺•) by 83.52 %. The content of secondary metabolites such as phenolics and flavonoids was also shown to be affected by temperature: peels dried at room temperature had substantial phenolic and flavonoid contents of 23.71 ± 3.86 mg GAE/g and 35.43 ± 0.10 mg QE/g, respectively. Metabolite profiling via LC-MS QTOF found that the room temperature extract contains 18 potential compounds, including oleamide, 6E,9E-octadecadienoic acid, C16 sphinganine, dodecanamide, and 2-hexyl-decanoic acid. Swiss Target Prediction was employed to identify hypothetical molecular targets, while oxidative stress-related targets were retrieved from the DrugBank, GeneCards, and DisGENET databases. A component-target-pathway network was constructed, encompassing 12 bioactive compounds retained after initial ADMET screening and 10 hub genes, namely TP53, AKT1, CASP3, BCL2, STAT3, HSP90AA1, HSP90AB1, BCL2L1, ESR1, and MDM2. The identified potential antioxidant-related pathways included intrinsic apoptotic signalling, mitochondrial membrane organisation, and mitochondrial transport, among others. Structure-based virtual screening through molecular docking revealed that (S)-2-Hydroxy-2-phenylacetonitrile O-b-D-allopyranoside exhibited significant interaction with HSP90AB1, with a binding affinity of −8.4 kcal/mol. These findings reinforce the pharmacological relevance of P. edulis peels as a high-value reservoir of potential antioxidant substances suitable for the development of functional foods and drugs for disease prevention and health promotion.
Citations: 0
CCTAD: A topologically associating domains detection method integrating convolutional autoencoder and hierarchical clustering
IF 3.1 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2026-06-01 Epub Date: 2026-01-20 DOI: 10.1016/j.compbiolchem.2026.108913
Feng Ruiping, Luo Junwei, Liu Kaihua, Guo Fei

Background

More accurate identification of topologically associating domains (TADs) is crucial for understanding chromatin spatial organization and gene regulation. Although many computational methods have been developed for TAD detection, existing approaches still exhibit several key limitations, such as difficulty in simultaneously modeling local and global interaction patterns, reliance on manually tuned parameters that lead to unstable boundary granularity, and high sensitivity to resolution and cell type. These issues result in unstable boundaries and high sensitivity to noise, thereby affecting the overall biological interpretability of TAD structures.

Results

To address these challenges, we propose a novel unsupervised TAD identification method, CCTAD, which for the first time integrates a one-dimensional convolutional autoencoder (1D-CAE) with connectivity-constrained hierarchical clustering. The 1D-CAE automatically extracts high-quality, low-dimensional feature representations from Hi-C contact matrices in an unsupervised manner, effectively capturing both local and global patterns of chromatin interactions. These learned features are then partitioned using a hierarchical clustering strategy augmented with genomic adjacency constraints, which simultaneously preserves the continuity of TADs along the linear genome and adaptively optimizes clustering granularity. This design overcomes the limitations of traditional methods in terms of boundary precision and stability. As a result, CCTAD produces TAD delineations that are more biologically meaningful and robust across different resolutions. The source code is available on GitHub at https://github.com/ruiping-Feng/CCTAD.

Conclusions

Evaluation of CCTAD across multiple cell lines and resolutions demonstrates its advantages in boundary identification of key regulatory elements.
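The connectivity-constrained clustering idea described in the Results can be sketched in a few lines. This is not the CCTAD implementation: the autoencoder step is replaced by hand-written toy feature vectors, the number of segments is fixed rather than chosen adaptively, centroid distance stands in for whatever linkage the authors use, and the function name `contiguous_agglomerative` is hypothetical. The point it illustrates is that allowing merges only between neighbouring bins keeps every cluster contiguous along the genome.

```python
from math import dist


def contiguous_agglomerative(features, n_segments):
    """Merge adjacent bins until n_segments contiguous segments remain.

    features: one feature vector per genomic bin, in genomic order.
    Returns half-open (start, end) bin index ranges.
    """
    if not 1 <= n_segments <= len(features):
        raise ValueError("n_segments must be between 1 and the number of bins")

    # Each segment stores [start, end, per-dimension sum of its vectors].
    segments = [[i, i + 1, list(vec)] for i, vec in enumerate(features)]

    def centroid(seg):
        size = seg[1] - seg[0]
        return [s / size for s in seg[2]]

    while len(segments) > n_segments:
        # Adjacency constraint: only neighbouring segments may merge.
        best = min(
            range(len(segments) - 1),
            key=lambda i: dist(centroid(segments[i]), centroid(segments[i + 1])),
        )
        left, right = segments[best], segments[best + 1]
        merged = [left[0], right[1], [a + b for a, b in zip(left[2], right[2])]]
        segments[best:best + 2] = [merged]

    return [(seg[0], seg[1]) for seg in segments]


if __name__ == "__main__":
    # Toy stand-in for learned per-bin features (hypothetical values).
    toy = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
           [5.0, 5.1], [5.1, 4.9],
           [9.0, 0.0], [9.1, 0.2]]
    print(contiguous_agglomerative(toy, 3))
```

On the toy input the three recovered segments cover bins 0 to 2, 3 to 4, and 5 to 6, and the boundaries between them play the role of TAD boundaries.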
Citations: 0
Molecular modelling, docking, and MD simulation of bacterial lipase: Binding interaction investigation against triglycerides
IF 3.1 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2026-06-01 Epub Date: 2026-01-11 DOI: 10.1016/j.compbiolchem.2026.108899
Titin Haryati, Norman Yoshi Haryono, Bernardinus Parlindungan Atmaka, Fira Alisya Nur Azizah, Akhmaloka, Muhammad Irfan
Bacterial lipases are thermostable and solvent-stable, which makes them suitable for development as biodiesel catalysts. The biodiesel industry relies on triglycerides as the main substrate. To identify the bacterial lipase with the best activity towards triglyceride substrates, efficient screening can therefore be carried out in silico. The bacterial lipases used in this investigation are Pseudomonas aeruginosa lipase, Burkholderia cepacia lipase, Serratia marcescens lipase, and Bacillus pumilus lipase. The four lipase sequences were retrieved from UniProtKB and then modeled in three dimensions using a homology-based approach with AlphaFold2. The four bacterial lipases were docked against triglyceride substrates such as glyceryl tridecanoate, glyceryl trilaurate, glyceryl trimyristate, glyceryl tripalmitate, glyceryl tristearate, glyceryl trioleate, and glyceryl trilinoleate. AutoDock Vina was used for the docking study. According to the docking results, all four bacterial lipases had the highest affinity for glyceryl tristearate. To study the stability of the binding interaction between the bacterial lipases and triglycerides, we ran molecular dynamics simulations based on AMBER. Based on RMSD, RMSF, catalytic distance measurements, and radius of gyration (Rg) analysis, the Burkholderia cepacia lipase-glyceryl trioleate and Serratia marcescens lipase-glyceryl trioleate complexes were found to be the most stable. In the future, the insights obtained in this study can be used to choose the best bacterial lipase candidates for triglyceride substrates and to develop engineered lipases with enhanced biocatalytic performance.
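Two of the stability measures named above, RMSD and radius of gyration, reduce to short formulas. The sketch below is a simplified illustration rather than the analysis pipeline used in the study: it assumes the frames are already superimposed (no alignment step), weights all atoms equally instead of by mass, and uses hypothetical function names and toy coordinates.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumption: both frames are already superimposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return sqrt(squared / len(coords_a))


def radius_of_gyration(coords):
    """Unweighted radius of gyration (all atoms treated as equal mass)."""
    if not coords:
        raise ValueError("coords must be non-empty")
    n = len(coords)
    center = [sum(point[i] for point in coords) / n for i in range(3)]
    squared = sum(
        sum((point[i] - center[i]) ** 2 for i in range(3)) for point in coords
    )
    return sqrt(squared / n)


if __name__ == "__main__":
    # Toy coordinates in angstroms (hypothetical, not from the study).
    reference = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
    shifted = [(x + 1.0, y, z) for x, y, z in reference]
    print(rmsd(reference, shifted))
    print(radius_of_gyration(reference))
```

In an MD analysis these quantities are computed per frame, and a complex is usually called stable when both curves plateau with small fluctuations.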
Citations: 0
GSR-ST: A generalized spatial-temporal framework for genomic signals and regions prediction using multi-scale feature fusion
IF 3.1 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2026-06-01 Epub Date: 2026-01-15 DOI: 10.1016/j.compbiolchem.2026.108909
Jujuan Zhuang, Ya Lu
Genomic DNA sequences contain diverse functional genomic signals and regions (GSRs) that are crucial for regulating gene expression. The precise identification of these GSRs is fundamental to elucidating genomic architecture and understanding regulatory mechanisms. However, due to the complexity and heterogeneity of the data, current computational methods remain limited in their predictive accuracy. In this work, we propose a generalized spatial-temporal deep learning framework, GSR-ST, for efficiently identifying three kinds of GSRs: polyadenylation signals (PAS), translation initiation sites (TIS), and promoters. GSR-ST improves predictive performance and generalization ability by integrating multi-scale information from DNA sequences through DNA Bidirectional Encoder Representations from Transformers (DNABERT) pre-trained embeddings and diverse handcrafted features. The framework employs a dual-channel parallel spatial-temporal network architecture to comprehensively capture sequence characteristics. Experimental results demonstrate that GSR-ST substantially outperforms state-of-the-art computational methods in predicting PAS and TIS across multiple eukaryotic species, as well as in predicting promoters for diverse bacterial species. The superior performance of GSR-ST on the independent test sets and its robustness in cross-species validations further confirm its effectiveness. The fusion of pre-trained DNABERT embeddings and multiple handcrafted features within a spatial-temporal network framework enables GSR-ST to effectively extract global and local DNA sequence features. This capability makes it a versatile framework for diverse GSR recognition tasks.
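The abstract does not say which handcrafted features GSR-ST uses, so the following is only an assumed example of what such a feature can look like: a normalized k-mer frequency vector, one common hand-built encoding for DNA sequences. The function name `kmer_frequencies` and the choice of k are hypothetical.

```python
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Return the normalized frequency of every A/C/G/T k-mer, in lexicographic order.

    Windows containing other symbols (for example N) are skipped.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


if __name__ == "__main__":
    # Hypothetical sequence containing the canonical AATAAA poly(A) signal.
    vector = kmer_frequencies("GGCAATAAAGTC", k=3)
    print(len(vector), round(sum(vector), 6))
```

A fixed-length vector like this can be concatenated with learned embeddings before classification, which is the general shape of the feature fusion the abstract describes.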
Citations: 0
MMRCL: An interpretable multi-modal deep learning framework for predicting hERG blockers
IF 3.1 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2026-06-01 Epub Date: 2026-01-31 DOI: 10.1016/j.compbiolchem.2026.108926
Yang Su, Jinzhou Wu, Ao Yang, Yumin Yuan, Wenli Du, Yi Xiang, Weifeng Shen
The human ether-a-go-go-related gene (hERG) encodes a voltage-gated potassium channel essential for cardiac action potential repolarization. Drug-induced hERG inhibition can prolong the QT interval, causing severe cardiac events such as torsade de pointes and fatal arrhythmias. In pharmaceutical chemistry, early prediction of hERG blockers is crucial for mitigating cardiotoxicity risks and for minimizing drug withdrawals and economic losses during discovery. To address this, an interpretable multi-modal molecular representation cross-learning framework (MMRCL) is developed, integrating multi-dimensional molecular fingerprints and molecular graphs to enrich structural features. MMRCL combines a dual-channel message passing neural network (MPNN) for atom- and bond-level structural features with a multi-layer perceptron for molecular fingerprint-based semantics. A multi-head cross-attention mechanism adaptively fuses features across modalities, enabling deep correlation modeling, followed by a fully connected neural network classifier. Extensive evaluation on an internal dataset (12,518 compounds with high-dimensional fingerprints and graph features) and three external test sets demonstrates MMRCL's superior performance compared to seven state-of-the-art baseline models, achieving the best AUC of 0.8895, PRC of 0.9073, and MCC of 0.6146 on the internal set. Interpretability analysis identifies key toxic substructures linked to hERG-blocking activity, aiding structure-activity relationship exploration. Ablation studies further confirm the contributions of multi-modal input and attention-based fusion. MMRCL not only achieves superior prediction accuracy and generalization but also enhances model interpretability, providing actionable insights for medicinal chemists.
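Of the metrics reported above, the Matthews correlation coefficient (MCC) is the least familiar, so a small worked sketch may help. It computes MCC from binary confusion-matrix counts using the standard definition. The function names and the toy labels are hypothetical and unrelated to the study's data.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 means hERG blocker."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical labels and predictions for eight compounds.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    print(matthews_corrcoef(*confusion_counts(y_true, y_pred)))
```

MCC ranges from −1 to 1 and stays informative on imbalanced classes, which is one reason it is often reported alongside AUC for toxicity classifiers.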
Citations: 0