Pub Date: 2026-06-01; Epub Date: 2026-01-29; DOI: 10.1016/j.compbiolchem.2026.108932
Mohammed F. Aldawsari, Hisham N. Altayb, Ehssan Moglad
Staphylococcus aureus is a leading cause of both community- and hospital-acquired infections, and the growing prevalence of antimicrobial resistance complicates clinical management worldwide. This study investigated the epidemiology, resistance trends, multidrug resistance (MDR) patterns, and the role of machine learning (ML) in predicting antibiotic susceptibility in Saudi Arabia. A total of 18,003 microbiology reports (2019–2024) were analyzed, identifying 2506 S. aureus isolates. Susceptibility testing included 31 antibiotics representing 11 pharmacological classes. Predictive ML models (Random Forest, Logistic Regression, Gradient Boosting) were trained and evaluated using accuracy, precision, recall, F1-score, and confusion matrices. Wound (24 %) and blood (23 %) were the most frequent sources of S. aureus. High resistance (>70 %) was observed for β-lactams, fluoroquinolones, and macrolides/lincosamides, while glycopeptides, oxazolidinones, and lipopeptides maintained excellent activity (<10 % resistance). MDR occurred in 30 % of isolates, XDR in 0.6 %, and no PDR isolates were detected. Among ML models, Random Forest achieved the best overall performance across most antibiotics, Logistic Regression was optimal for ampicillin, and Gradient Boosting for linezolid. Vancomycin, linezolid, penicillin, and SXT achieved precision and recall above 0.92, demonstrating strong predictive reliability. S. aureus remains a major clinical threat in Saudi Arabia, with high MDR rates but preserved efficacy of last-line antibiotics. This study highlights the value of combining multi-center surveillance with interpretable machine learning approaches to support antimicrobial stewardship, enhance early resistance prediction, and inform data-driven clinical decision-making, particularly in settings where rapid molecular diagnostics may be limited.
Title: Predicting antimicrobial resistance in Staphylococcus aureus using machine learning: Insights from a five-year surveillance study. Journal: Computational Biology and Chemistry, volume 122, Article 108932.
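The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score) can all be read off a confusion matrix. The following is a minimal sketch for a single antibiotic, assuming a binary encoding in which "R" (resistant) is the positive class and "S" (susceptible) the negative class; the function name `binary_metrics`, the label encoding, and the toy data are illustrative assumptions and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive="R"):
    """Compute accuracy, precision, recall and F1 from paired labels.

    `positive` is the label counted as the positive class
    (hypothetical encoding: "R" = resistant, "S" = susceptible).
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix cells.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy example (made-up labels, not study data).
observed = ["R", "R", "S", "S", "R", "S"]
predicted = ["R", "S", "S", "S", "R", "R"]
metrics = binary_metrics(observed, predicted)
print(metrics)
```

Reporting precision and recall together, as the abstract does, matters for imbalanced antibiotics: when resistance is rare (for example below 10 %), accuracy alone can look high even for a model that never predicts resistance.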
Pub Date: 2026-06-01; Epub Date: 2026-02-06; DOI: 10.1016/j.compbiolchem.2026.108928
Debaleena Samanta, Malavika Bhattacharya
Background
The gut ecosystem is maintained by immune regulation through the intestinal microbiota, which is also implicated in inflammatory diseases such as gastric cancer. Hyaluronic acid derived from the gut microorganism Streptococcus pyogenes directly controls the up- and down-regulation of gene sets that help to promote or inhibit gastric cancer.
Methods
The GEO database is used to identify potential hub genes related to hyaluronic acid-mediated gastric cancer. Gene expression analysis and PPI network analysis are performed with EMBL-EBI and the STRING database under DAVID software, respectively. Gene interactions are studied with the Reactome data source, and gene networks are identified with the GeneMANIA online server. BIOVENN is used to produce the Venn diagram and GSEA to generate the heat map. Microbial signal transduction is identified through the MiST website, regulons and transcription factors are analysed through RegPrecise, and the MetaCyc web source is used for biosynthetic pathway analysis. TCGA is used to study cancer genomics and gene interaction pathways. KEGG pathway enrichment is done with the ShinyGO resource. KM survival plots are produced with CybersortX. Genome expression analysis is done with the GEPIA web portal. Resistome and variant isolation and by-products of Streptococcus pyogenes MGAS are examined through the CARD and BV-BRC databases. Ligand-drug analysis and TCGA drug response and survival analysis are carried out with the MCULE and GEPIA 3 web sources.
Results
Differential expression analysis identified up-regulated and down-regulated genes related to the HMMR gene. Venn analysis identified 3 co-expressed genes shared among HMMR, IL1B and HAS3. The global cancer heat map of the HMMR gene showed expression intensity values ranging from a high of 0.50204 to a low of −0.58367. The cellular response related to the HMMR gene is responsible for programmed cell death due to inactivation of the Cyclin B (Cdk1) complex mediated by Chk1/Chk2 (Cds1). Streptococcus pyogenes-mediated biological pathways, transcription factors, regulons and genomic analysis of the HMMR protein are also identified. KEGG enrichment analysis shows the NF-kB signaling pathway within the hyaluronic acid-mediated network gene set. KM survival analysis is reported through hazard ratio (HR) and p-value. Drug-target docking analysis of the ligand hyaluronic acid with the drugs 5-Fluorouracil and Epirubicin, together with TCGA drug survival and response analysis, is used to assess therapeutic interventions.
Title: Pivot gene enrichment analysis of Streptococcus pyogenes specific hyaluronic acid mediated disease prognosis on gastric cancer: Based on bioinformatics study. Journal: Computational Biology and Chemistry, volume 122, Article 108928.
The escalating threat of antimicrobial resistance (AMR) in Mycobacterium tuberculosis highlights the urgent need for innovative therapeutic strategies based on novel inhibitor molecules. Plant and marine habitats are rich reservoirs of bioactive compounds. Targeting pathways critical for M. tuberculosis survival, such as cell wall biosynthesis, offers a promising approach for drug discovery. Diaminopimelate, a critical component of the bacterial cell wall, is synthesized through the lysine biosynthesis pathway. Dihydrodipicolinate synthase (Mtb-DapA) is a promising drug target in this pathway. We screened antitubercular natural and marine-derived compounds against Mtb-DapA using molecular docking and molecular dynamics (MD) simulations, and then performed Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) analysis to estimate binding free energies and identify promising inhibitors. In this study, we analysed 633 phytochemicals from the BioPhytMol Database, 210 anti-TB phytochemicals from recent literature, and 406 marine habitat-derived anti-TB compounds. We report the top three inhibitors, glycyrrhizin, micromeline and lico-isoflavone, based on MD simulations and MM-PBSA analysis. The MM-PBSA binding free energies of glycyrrhizin, micromeline and lico-isoflavone were −50.39 kcal/mol, −17.90 kcal/mol and −17.88 kcal/mol, respectively. Glycyrrhizin emerged as the most potent inhibitor. These findings underscore the therapeutic potential of glycyrrhizin, micromeline and lico-isoflavone as promising candidates for further development, thereby offering hope for alternative treatments against M. tuberculosis.
Title: Computational screening of natural plant and marine compounds as potential inhibitors of Mycobacterium tuberculosis dihydrodipicolinate synthase. Authors: Swati Meena, Firdaus Fatima, Faizan Abul Qais, Srinivasan Ramachandran. Journal: Computational Biology and Chemistry, volume 122, Article 108936. Pub Date: 2026-06-01. DOI: 10.1016/j.compbiolchem.2026.108936.
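For readers unfamiliar with MM-PBSA, the binding free energies quoted above are conventionally estimated from the standard end-point decomposition sketched below. This is the generic textbook form, not a formula taken from the paper; in particular, the abstract does not say whether the entropy term $-TS$ was included, and it is often omitted in practice.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= G_{\mathrm{complex}} - \left( G_{\mathrm{receptor}} + G_{\mathrm{ligand}} \right) \\
G &= E_{\mathrm{MM}} + G_{\mathrm{PB}} + G_{\mathrm{SA}} - T S
\end{align}
```

Here $E_{\mathrm{MM}}$ is the molecular-mechanics energy, $G_{\mathrm{PB}}$ the polar solvation term from the Poisson-Boltzmann equation, and $G_{\mathrm{SA}}$ the non-polar term proportional to solvent-accessible surface area. A more negative $\Delta G_{\mathrm{bind}}$ indicates stronger predicted binding, which is why −50.39 kcal/mol ranks ahead of −17.90 and −17.88 kcal/mol.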
Pub Date: 2026-06-01; Epub Date: 2025-12-30; DOI: 10.1016/j.compbiolchem.2025.108870
Simone Lucà, Francesco Masillo, Zsuzsanna Lipták
Summary:
Prefix-free parsing (Boucher et al., 2019) is a highly effective heuristic for computing text indexes for very large amounts of biological data. The algorithm constructs a data structure, the prefix-free parse (PFP) of the input, consisting of a dictionary and a parse, which is then used to speed up computation of the final index. In this paper, we study the size of the PFP, which we refer to as π, and show that it is a powerful tool in its own right. To show this, we present two use cases. We first study the application of π as a repetitiveness measure of the input text, and compare it to other currently used repetitiveness measures, including z (the number of Lempel–Ziv phrases), r (the number of runs of the Burrows–Wheeler Transform), and δ (the text’s substring complexity). We then turn to the use of π as a measure for pangenome openness. In both applications, our results are similar to existing measures, but our tool, in almost all cases, is more efficient than those computing the other measures, both in terms of time and space, sometimes by orders of magnitude. We close the paper with a detailed systematic study of the parameter choice for PFP (window size w and modulus p). This gives rise to interesting open questions.
Availability and implementation:
The source code is available at https://github.com/simolucaa/piPFP. The accession codes for all the datasets used and the raw results are available at https://github.com/simolucaa/piPFP_experiments.
Title: Measuring genomic data with prefix-free parsing. Journal: Computational Biology and Chemistry, volume 122, Article 108870.
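To make the roles of the window size and modulus concrete, here is a simplified sketch of prefix-free parsing and of a PFP size computed from it. It is not the piPFP implementation: the helper names `window_hash` and `prefix_free_parse` are hypothetical, the hash is recomputed for every window rather than rolled, the sentinel padding used by real implementations is replaced by a forced trigger at the final window, and the size is assumed to be the total length of the dictionary phrases plus the number of phrases in the parse.

```python
def window_hash(window, base=256, mod=(1 << 61) - 1):
    """Polynomial hash of a short string (recomputed per window for clarity)."""
    h = 0
    for ch in window:
        h = (h * base + ord(ch)) % mod
    return h


def prefix_free_parse(text, w, p):
    """Return (dictionary, parse, pi) for a simplified prefix-free parse.

    A length-w window is a trigger when its hash is 0 modulo p; the final
    window is always treated as a trigger. Consecutive phrases overlap by
    exactly w characters.
    """
    n = len(text)
    if w < 1 or p < 1 or n <= w:
        raise ValueError("need w >= 1, p >= 1 and len(text) > w")

    phrases = []
    start = 0
    for i in range(1, n - w + 1):
        is_last_window = i == n - w
        if is_last_window or window_hash(text[i:i + w]) % p == 0:
            # The phrase ends with the trigger window starting at i.
            phrases.append(text[start:i + w])
            start = i

    dictionary = sorted(set(phrases))                # distinct phrases
    rank = {phrase: k for k, phrase in enumerate(dictionary)}
    parse = [rank[phrase] for phrase in phrases]     # text as phrase ranks
    # Assumed size measure: total dictionary length + parse length.
    pi = sum(len(phrase) for phrase in dictionary) + len(parse)
    return dictionary, parse, pi


dictionary, parse, pi = prefix_free_parse("abababab", w=2, p=1)
print(dictionary, parse, pi)  # ['aba', 'bab'] [0, 1, 0, 1, 0, 1] 12
```

With p = 1 every window is a trigger, so the example is independent of the hash function. Larger p gives fewer, longer phrases; on repetitive inputs the same phrases recur, which keeps the dictionary small relative to the text.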
Pub Date: 2026-06-01; Epub Date: 2026-01-08; DOI: 10.1016/j.compbiolchem.2026.108880
Lu Shen, Feng Hu, Libing Bai
Drug-drug interactions represent a key problem for drug research, development, and clinical practice. Accurately predicting interactions between combined drugs is crucial to improve treatment safety and optimize medication regimens. However, the exponential increase in potential drug combinations, along with the limitations of conventional graph and multi-layer network prediction models (which primarily capture only binary relationships between drugs and struggle to represent multi-element synergistic interactions), limits prediction performance. To overcome these challenges, this paper proposes a Multi-Layer Hypergraph framework for drug interaction prediction using Transformer and Hypergraph Convolution (MLHTHC). This framework first constructs a multi-layer similarity hypergraph of drugs based on four attribute types: chemical structure, ATC code, drug category, and corresponding targets. Using drug-drug interaction data from the KEGG database as a benchmark, the spectral Hamming similarity method is adopted to calculate the structural similarity between the constructed hypergraph and the benchmark hypergraph, enabling the determination of the importance weight for each hypergraph layer. Subsequently, a hypergraph convolutional neural network performs network embedding on the drug nodes of each layer; the Transformer model is used to weight and fuse the multi-layer features; and finally, a multi-layer perceptron (MLP) is used to predict drug-drug interactions (DDIs). Experimental results demonstrate that this model outperforms existing methods such as DPSP and DANN, with the integration of Transformer and hypergraph convolution significantly enhancing prediction accuracy. This approach provides an effective tool for drug-drug interaction prediction.
Title: A multi-layer hypergraph framework for drug-drug interaction prediction based on transformer and hypergraph convolution. Journal: Computational Biology and Chemistry, volume 122, Article 108880.
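The abstract does not specify which hypergraph convolution MLHTHC uses. As an illustration of how a hyperedge can connect more than two drugs at once, the sketch below implements one propagation step of the widely used HGNN-style operator, X' = Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X, without the learnable weight matrix or nonlinearity. The function name `hypergraph_propagate`, the toy hyperedges, and the choice of operator are assumptions made for illustration only.

```python
import math


def hypergraph_propagate(features, hyperedges, edge_weights=None):
    """One HGNN-style smoothing step over a hypergraph.

    features:     list of per-node feature vectors (lists of floats)
    hyperedges:   list of hyperedges, each a list of distinct node indices
    edge_weights: optional list of hyperedge weights (defaults to 1.0 each)
    """
    n = len(features)
    dim = len(features[0])
    if edge_weights is None:
        edge_weights = [1.0] * len(hyperedges)

    # Node degree = sum of weights of the hyperedges containing the node.
    node_degree = [0.0] * n
    for edge, weight in zip(hyperedges, edge_weights):
        for v in edge:
            node_degree[v] += weight
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in node_degree]

    out = [[0.0] * dim for _ in range(n)]
    for edge, weight in zip(hyperedges, edge_weights):
        # Gather: mean of degree-normalised member features (De^-1 H^T Dv^-1/2 X).
        message = [
            sum(inv_sqrt[v] * features[v][k] for v in edge) / len(edge)
            for k in range(dim)
        ]
        # Scatter: send the edge message back to every member (Dv^-1/2 H W ...).
        for v in edge:
            for k in range(dim):
                out[v][k] += inv_sqrt[v] * weight * message[k]
    return out


# Toy example: three drugs sharing one hypothetical attribute (one hyperedge).
smoothed = hypergraph_propagate([[3.0], [0.0], [0.0]], [[0, 1, 2]])
print(smoothed)
```

A single hyperedge over three nodes averages their features in one step, which is the multi-element relationship that an ordinary pairwise graph edge cannot express directly.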
Pub Date: 2026-06-01; Epub Date: 2026-01-17; DOI: 10.1016/j.compbiolchem.2026.108910
Mohamad Norisham Mohamad Rosdi, Mohd Nurhafizam Karuning, Nur Hanisah Azmi, Mohamad Hafizi Abu Bakar, Yanty Noorziana Abdul Manaf, Feri Eko Hermanto, Aniza Saini, Mohd Azrie Awang, Zainul Amiruddin Zakaria
Passiflora edulis peels possess considerable antioxidative potential, which is attributed to their diverse bioactive components. Nevertheless, these substances are susceptible to thermal degradation, which can diminish their usefulness and result in resource wastage. This research explores the influence of drying under varying temperature conditions (room temperature (∼28 °C), 40 °C, and 70 °C) on the antioxidant properties and metabolite composition of P. edulis peel extracts. A comprehensive analytical approach was adopted, encompassing proximate analysis, vitamin C quantification, total phenolic and flavonoid determinations, free radical scavenging assays, metabolite profiling, network pharmacology, molecular docking, and molecular dynamics simulation. In this study, the content of crude fibre and primary metabolites, including fat, protein and carbohydrate, was shown to be affected by increasing drying temperature. Likewise, the extract of P. edulis peels dried at room temperature exhibited significant antioxidant activity at 1 mg/mL, inhibiting 2,2-diphenyl-1-picrylhydrazyl radicals (DPPH•) by 81.20 % and 2,2’-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) radicals (ABTS⁺•) by 83.52 %. The content of secondary metabolites such as phenolics and flavonoids was also shown to be affected by temperature: peels dried at room temperature retained substantial phenolic and flavonoid contents of 23.71 ± 3.86 mg GAE/g and 35.43 ± 0.10 mg QE/g, respectively. Metabolite profiling via LC-MS QTOF revealed that the room temperature extract contains 18 potential compounds, including oleamide, 6E,9E-octadecadienoic acid, C16 sphinganine, dodecanamide, and 2-hexyl-decanoic acid. Swiss Target Prediction was employed to identify hypothetical molecular targets, while oxidative stress-related targets were retrieved from the DrugBank, GeneCards, and DisGeNET databases. A component–target-pathway network was constructed, encompassing 12 bioactive compounds after initial ADMET screening and 10 hub genes, namely TP53, AKT1, CASP3, BCL2, STAT3, HSP90AA1, HSP90AB1, BCL2L1, ESR1, and MDM2. The identified potential antioxidant-related pathways included intrinsic apoptotic signalling, mitochondrial membrane organisation, and mitochondrial transport, among others. Structure-based virtual screening through molecular docking revealed that (S)-2-Hydroxy-2-phenylacetonitrile O-b-D-allopyranoside exhibited significant interaction with HSP90AB1, with a binding affinity of −8.4 kcal/mol. These findings reinforce the pharmacological relevance of P. edulis peels as a high-value reservoir of potential antioxidant substances suitable for the development of functional foods and drugs for disease prevention and health promotion.
Title: Influence of drying temperature on the metabolites profile and potential antioxidant pathways of Passiflora edulis peel: Integrating untargeted metabolomics with network pharmacology analyses, molecular docking, and molecular dynamics simulation. Journal: Computational Biology and Chemistry, volume 122, Article 108910.
Pub Date: 2026-06-01; Epub Date: 2026-01-20; DOI: 10.1016/j.compbiolchem.2026.108913
Feng Ruiping, Luo Junwei, Liu Kaihua, Guo Fei
Background
More accurate identification of topologically associating domains (TADs) is crucial for understanding chromatin spatial organization and gene regulation. Although many computational methods have been developed for TAD detection, existing approaches still exhibit several key limitations, such as difficulty in simultaneously modeling local and global interaction patterns, reliance on manually tuned parameters that lead to unstable boundary granularity, and high sensitivity to resolution and cell type. These issues result in unstable boundaries and high sensitivity to noise, thereby affecting the overall biological interpretability of TAD structures.
Results
To address these challenges, we propose a novel unsupervised TAD identification method, CCTAD, which for the first time integrates a one-dimensional convolutional autoencoder (1D-CAE) with connectivity-constrained hierarchical clustering. The 1D-CAE automatically extracts high-quality, low-dimensional feature representations from Hi-C contact matrices in an unsupervised manner, effectively capturing both local and global patterns of chromatin interactions. These learned features are then partitioned using a hierarchical clustering strategy augmented with genomic adjacency constraints, which simultaneously preserves the continuity of TADs along the linear genome and adaptively optimizes clustering granularity. This design overcomes the limitations of traditional methods in terms of boundary precision and stability. As a result, CCTAD produces TAD delineations that are more biologically meaningful and robust across different resolutions. The source code is available on GitHub at https://github.com/ruiping-Feng/CCTAD.
Conclusions
Evaluation of CCTAD across multiple cell lines and resolutions demonstrates its advantages in boundary identification of key regulatory elements.
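The adjacency-constrained clustering step described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `contiguous_agglomerative`, the centroid-distance merge criterion, the fixed target number of domains, and the toy feature vectors are all assumptions made for illustration (the abstract states that CCTAD chooses granularity adaptively and takes its features from a 1D-CAE, neither of which is reproduced here). The sketch only shows the core idea that clusters may merge with their genomic neighbours and nothing else, so every resulting domain is a contiguous run of bins.

```python
from math import dist


def contiguous_agglomerative(features, n_clusters):
    """Agglomerative clustering in which only neighbouring segments may merge.

    features: list of equal-length numeric vectors, one per genomic bin,
              ordered along the chromosome (hypothetical stand-ins for
              learned embeddings).
    Returns a list of (start, end) bin ranges, end exclusive.
    """
    # Each segment stores its bin range and the running sum of its vectors.
    segments = [{"start": i, "end": i + 1, "sum": list(v)}
                for i, v in enumerate(features)]

    def centroid(seg):
        n = seg["end"] - seg["start"]
        return [s / n for s in seg["sum"]]

    while len(segments) > n_clusters:
        # Connectivity constraint: only adjacent pairs (i, i + 1) are candidates.
        best = min(
            range(len(segments) - 1),
            key=lambda i: dist(centroid(segments[i]), centroid(segments[i + 1])),
        )
        left, right = segments[best], segments[best + 1]
        merged = {
            "start": left["start"],
            "end": right["end"],
            "sum": [a + b for a, b in zip(left["sum"], right["sum"])],
        }
        segments[best:best + 2] = [merged]

    return [(s["start"], s["end"]) for s in segments]


# Toy example: seven bins forming three visibly distinct groups.
toy = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
       [5.0, 5.1], [5.1, 5.0],
       [9.0, 0.0], [9.1, 0.1]]
domains = contiguous_agglomerative(toy, 3)
print(domains)  # [(0, 3), (3, 5), (5, 7)]
```

Because merges are restricted to neighbours, the output is always a partition of the bin axis into intervals, which is the property that keeps TADs continuous along the linear genome.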
Title: CCTAD: A topologically associating domains detection method integrating convolutional autoencoder and hierarchical clustering. Journal: Computational Biology and Chemistry, volume 122, Article 108913.
Pub Date: 2026-06-01; Epub Date: 2026-01-11; DOI: 10.1016/j.compbiolchem.2026.108899
Titin Haryati, Norman Yoshi Haryono, Bernardinus Parlindungan Atmaka, Fira Alisya Nur Azizah, Akhmaloka, Muhammad Irfan
Bacterial lipases are thermostable and solvent-stable, which makes them suitable for development as biodiesel catalysts. The biodiesel industry relies on triglycerides as its main substrate. To find the bacterial lipase with the best activity towards triglyceride substrates, activity can therefore be screened efficiently through an in silico study. The bacterial lipases used in this investigation are Pseudomonas aeruginosa lipase, Burkholderia cepacia lipase, Serratia marcescens lipase, and Bacillus pumilus lipase. These four lipases were retrieved from UniProtKB and then modeled in three dimensions using AlphaFold2. The four lipases were docked against triglyceride substrates such as glyceryl tridecanoate, glyceryl trilaurate, glyceryl trimyristate, glyceryl tripalmitate, glyceryl tristearate, glyceryl trioleate, and glyceryl trilinoleate, using AutoDock Vina. According to the docking studies, all four lipases had the highest affinity for glyceryl tristearate. To study the stability of the binding interactions between the lipases and the triglycerides, we ran molecular dynamics simulations with AMBER. Based on RMSD, RMSF, catalytic distance measurements, and radius of gyration (Rg) analysis, the Burkholderia cepacia lipase-glyceryl trioleate and Serratia marcescens lipase-glyceryl trioleate complexes showed the most stable interactions. In the future, the insights obtained in this study can serve as a reference for choosing the best bacterial lipase candidates for triglyceride substrates and for developing engineered lipases with enhanced biocatalytic performance.
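Two of the stability measures named above, RMSD and the radius of gyration, reduce to short formulas over atomic coordinates. The sketch below shows them on toy coordinates; it is an illustration under stated assumptions rather than the analysis used in the study. In particular, `rmsd` assumes the two frames are already superimposed, `radius_of_gyration` is unweighted (mass weighting is common in practice), and the function names and coordinates are hypothetical.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return sqrt(squared / len(coords_a))


def radius_of_gyration(coords):
    """Unweighted radius of gyration: RMS distance of atoms from their centroid."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    squared = sum((p[k] - center[k]) ** 2 for p in coords for k in range(3))
    return sqrt(squared / n)


# Toy frames: the second frame is the first one shifted by 1 unit along z.
reference = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
frame = [(x, y, z + 1.0) for x, y, z in reference]

print(rmsd(reference, frame))         # 1.0
print(radius_of_gyration(reference))  # 1.0
```

A low, flat RMSD trace over a trajectory and a steady radius of gyration are the usual signs that a complex stays compact and close to its starting pose.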
Title: Molecular modelling, docking, and MD simulation of bacterial lipase: Binding interaction investigation against triglycerides. Journal: Computational Biology and Chemistry, volume 122, Article 108899.
Pub Date: 2026-06-01; Epub Date: 2026-01-15; DOI: 10.1016/j.compbiolchem.2026.108909
Jujuan Zhuang, Ya Lu
Genomic DNA sequences contain diverse functional genomic signals and regions (GSRs) that are crucial for regulating gene expression. The precise identification of these GSRs is fundamental to elucidating genomic architecture and understanding regulatory mechanisms. However, due to data complexity and heterogeneity, current computational methods remain limited in their predictive accuracy. In this work, we propose a generalized spatial-temporal deep learning framework, GSR-ST, for efficiently identifying three kinds of GSRs: polyadenylation signals (PAS), translation initiation sites (TIS), and promoters. GSR-ST improves predictive performance and generalization ability by integrating multi-scale information from DNA sequences through DNA Bidirectional Encoder Representations from Transformers (DNABERT) pre-trained embeddings and diverse handcrafted features. The framework employs a dual-channel parallel spatial-temporal network architecture to comprehensively capture sequence characteristics. Experimental results demonstrate that GSR-ST substantially outperforms state-of-the-art computational methods in predicting PAS and TIS across multiple eukaryotic species, as well as in predicting promoters for diverse bacterial species. The superior performance of GSR-ST on the independent test sets and its robustness in cross-species validations further confirm its effectiveness. The fusion of pre-trained DNABERT embeddings and multiple handcrafted features, leveraged within a spatial-temporal network framework, enables GSR-ST to effectively extract global and local DNA sequence features. This capability makes it a versatile framework for diverse GSR recognition tasks.
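As a rough illustration of the two input streams mentioned above, the sketch below splits a DNA sequence into overlapping k-mers, the kind of tokens the original DNABERT consumes, and computes k-mer frequencies as one example of a handcrafted feature. The choice of k = 3, the use of k-mer frequency as the handcrafted feature, the function names, and the example sequence are assumptions for illustration; the abstract does not specify which features or settings GSR-ST uses.

```python
from collections import Counter


def kmer_tokens(sequence, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_frequencies(sequence, k=3):
    """Relative frequency of each k-mer observed in the sequence."""
    tokens = kmer_tokens(sequence, k)
    counts = Counter(tokens)
    total = len(tokens)
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequence containing the canonical AATAAA polyadenylation motif.
tokens = kmer_tokens("AATAAAGC", k=3)
freqs = kmer_frequencies("AATAAAGC", k=3)

print(tokens)  # ['AAT', 'ATA', 'TAA', 'AAA', 'AAG', 'AGC']
print(freqs["AAA"])
```

In a fusion model of this kind, the token list would feed a pre-trained language model to obtain embeddings, while the frequency vector would travel alongside as an explicit, interpretable feature.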
Title: GSR-ST: A generalized spatial-temporal framework for genomic signals and regions prediction using multi-scale feature fusion. Journal: Computational Biology and Chemistry, volume 122, Article 108909.
Pub Date: 2026-06-01; Epub Date: 2026-01-31; DOI: 10.1016/j.compbiolchem.2026.108926
Yang Su, Jinzhou Wu, Ao Yang, Yumin Yuan, Wenli Du, Yi Xiang, Weifeng Shen
The human ether-a-go-go-related gene (hERG) encodes a voltage-gated potassium channel essential for cardiac action potential repolarization. Drug-induced hERG inhibition can prolong the QT interval, causing severe cardiac events such as torsade de pointes and fatal arrhythmias. In pharmaceutical chemistry, early prediction of hERG blockers is crucial to mitigate cardiotoxicity risks and to minimize drug withdrawals and economic losses in drug discovery. To address this, an interpretable multi-modal molecular representation cross-learning framework (MMRCL) is developed, integrating multi-dimensional molecular fingerprints and molecular graphs to enrich structural features. MMRCL combines a dual-channel message passing neural network (MPNN) for atom- and bond-level structural features with a multi-layer perceptron for molecular fingerprint-based semantics. A multi-head cross-attention mechanism adaptively fuses features across modalities, enabling deep correlation modeling, and is followed by a fully connected neural network classifier. Extensive evaluation on an internal dataset (12,518 compounds with high-dimensional fingerprints and graph features) and three external test sets demonstrates MMRCL's superior performance compared to seven state-of-the-art baseline models, achieving the best AUC of 0.8895, PRC of 0.9073, and MCC of 0.6146 on the internal set. Interpretability analysis identifies key toxic substructures linked to hERG-blocking activity, aiding structure-activity relationship exploration. Ablation studies further confirm the contributions of multi-modal input and attention-based fusion. MMRCL achieves superior prediction accuracy and generalization while also enhancing model interpretability, providing actionable insights for medicinal chemists.
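The cross-attention fusion described above can be sketched in its simplest form: queries from one modality attend over keys and values from the other. The code below is a single-head, projection-free simplification and not the MMRCL architecture; a real multi-head layer would add learned query, key, and value projections and run several heads in parallel. Which modality supplies the queries, the function names, and the toy vectors are all assumptions for illustration.

```python
from math import exp, sqrt


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention across two modalities.

    queries: vectors from one modality (here, hypothetically, fingerprint features).
    keys, values: vectors from the other modality (hypothetically, graph features).
    Learned projection matrices are deliberately omitted in this sketch.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Toy example: identical keys give uniform attention, so the output
# is the plain average of the two value vectors.
fingerprint_query = [[1.0, 2.0]]
graph_keys = [[0.0, 0.0], [0.0, 0.0]]
graph_values = [[1.0, 3.0], [3.0, 5.0]]
fused = cross_attention(fingerprint_query, graph_keys, graph_values)
print(fused)  # [[2.0, 4.0]]
```

Each fused vector is a convex combination of the value vectors, so the attention weights themselves indicate which parts of the other modality contributed to a given prediction.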
Title: MMRCL: An interpretable multi-modal deep learning framework for predicting hERG blockers. Journal: Computational Biology and Chemistry, volume 122, Article 108926.