Pub Date: 2026-05-01; Epub Date: 2026-02-12; DOI: 10.1016/j.cmpb.2026.109282
Julia Rahman, M.A. Hakim Newton, Jiffriya Mohamed Abdul Cader, Md Khaled Ben Islam, Mohammed Eunus Ali, Abdul Sattar
Background:
Protein–ligand binding affinity prediction is essential in structure-based drug design, where binding scores guide the selection of promising candidate ligands. Existing deep learning models often use 3D grids, voxelized complexes, or molecular graphs. These representations are resource-intensive and may not capture specific directional interactions.
Objective:
This paper introduces angular geometric features as key descriptors of binding interactions.
Methods:
Seven types of dihedral angles between protein and ligand atoms are extracted to encode orientation and geometry. A fully connected ensemble network, called the Angle-Aware Predictor (AAP), integrates these features.
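As an illustration of the kind of angular descriptor described here, the following minimal sketch computes a single dihedral angle from four 3D points. The abstract does not say which protein and ligand atoms form each of the seven angle types, so the function name `dihedral_deg` and the choice of points are assumptions made for illustration, not the authors' feature extractor.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    Hypothetical helper: the points could be, for example, two protein
    atoms and two ligand atoms, but that pairing is an assumption.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane (p0, p1, p2)
    n2 = _cross(b2, b3)          # normal of the plane (p1, p2, p3)
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Usage: a cis arrangement gives 0 degrees, a trans arrangement 180 degrees.
cis = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
```

The `atan2` form keeps the sign of the angle and avoids the numerical instability of `acos` near 0 and 180 degrees.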
Results:
On CASF-2016, AAP achieves state-of-the-art results, with a correlation coefficient (R) of 0.872, root mean squared error (RMSE) of 1.072, mean absolute error (MAE) of 0.817, standard deviation (SD) of 1.077, and concordance index (CI) of 0.845. On four additional benchmarks, AAP shows consistent improvements ranging from 0.3% to 36%.
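For readers unfamiliar with these scoring metrics, the sketch below shows one common way to compute R, RMSE, MAE, and the concordance index from measured and predicted affinities. The function names are illustrative, and the tie handling in `concordance_index` (prediction ties count as half) is an assumption; the SD used in CASF evaluations is defined from a linear fit and is not reproduced here.

```python
import math

def pearson_r(y, p):
    n = len(y)
    my, mp = sum(y) / n, sum(p) / n
    cov = sum((a - my) * (b - mp) for a, b in zip(y, p))
    vy = sum((a - my) ** 2 for a in y)
    vp = sum((b - mp) ** 2 for b in p)
    return cov / math.sqrt(vy * vp)

def rmse(y, p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def mae(y, p):
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def concordance_index(y, p):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant, comparable = 0.0, 0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] == y[j]:
                continue              # pairs with equal true values are not comparable
            comparable += 1
            d = (y[i] - y[j]) * (p[i] - p[j])
            if d > 0:
                concordant += 1.0
            elif d == 0:
                concordant += 0.5     # tied predictions count as half
    return concordant / comparable

# Usage with made-up affinities (not data from the paper).
y_true = [1.0, 2.0, 3.0]
y_pred = [1.1, 1.9, 3.2]
scores = (pearson_r(y_true, y_pred), rmse(y_true, y_pred),
          mae(y_true, y_pred), concordance_index(y_true, y_pred))
```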
Conclusion:
The angular features are effective, lightweight, and robust descriptors for binding affinity prediction. These results highlight angular geometry as a valuable direction for future structure-based drug discovery. The program and data of AAP are publicly available at https://github.com/juliacse06/AAP.
Title: Harnessing angular geometry in deep learning for protein–ligand binding affinity prediction. Journal: Computer Methods and Programs in Biomedicine, volume 278, Article 109282.
Pub Date: 2026-05-01; Epub Date: 2026-01-29; DOI: 10.1016/j.cmpb.2026.109265
Shichen Zhang, Dinghan Hu, Le Luo, Jiuwen Cao
Background and Objective:
The diagnosis of carotid plaques plays an important role in revealing cardiovascular and cerebrovascular diseases and has therefore attracted widespread research attention. However, most medical examinations rely heavily on specialists and carotid ultrasound images, which are time-consuming, radiative, expensive, and limited in tracking disease progression. To alleviate these deficiencies, and inspired by the human blood supply sequence, this paper presents a detailed study of the association between carotid plaque and ocular surface image features.
Methods:
This paper systematically verifies the correlation between carotid plaque and ocular surface images through a multi-dimensional feature analysis approach incorporating texture, frequency-domain, and color features. The analysis combines feature selection, confidence evaluation, and distribution property studies to establish robust associations. In addition, multiple machine learning classifiers are used to evaluate the robustness of the extracted features, with subgroup validation conducted across different subsets to systematically assess the influence of age and gender.
Results:
The proposed method achieves high prediction accuracy on 8875 individuals from Hangzhou Wuyunshan Hospital (Hangzhou Institute for Health Promotion), with electronic health record (EHR) features showing the strongest association (Odds Ratios [ORs]: 4.35 [3.90-4.86] in males; 2.92 [2.60-3.27] in females). Experimental results demonstrate that age, male gender, and ocular surface image features – including EHR, local binary patterns (LBP), gray-level gradient co-occurrence matrix (GLGCM), and gray-level co-occurrence matrix (GLCM) – show strong associations with carotid plaque, where LBP and EHR features are selected most frequently.
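To make the reported odds ratios and their bracketed intervals easier to read, here is a minimal sketch of an odds ratio with a 95% confidence interval from a 2x2 table, using the standard log (Woolf) method. The abstract does not state how the authors estimated their ORs (they may come from a regression model), so both the method and the counts below are assumptions for illustration only.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and confidence interval from a 2x2 table.

    a: feature present, plaque present    b: feature present, plaque absent
    c: feature absent,  plaque present    d: feature absent,  plaque absent
    All four counts must be positive. z=1.96 gives a 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Usage with hypothetical counts (not data from the study).
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
```

An interval that excludes 1, as in the values reported above, indicates an association that is unlikely to be due to chance alone under this model.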
Conclusions:
Ocular surface image analysis offers a practical and non-invasive method for carotid plaque screening. The observed feature associations and strong predictive performance highlight its potential for clinical applications, especially in large-scale population screening.
Title: Correlative analysis between ocular surface features and carotid plaque: A multimodal machine learning framework. Journal: Computer Methods and Programs in Biomedicine, volume 278, Article 109265.
Pub Date: 2026-05-01; Epub Date: 2026-02-09; DOI: 10.1016/j.cmpb.2026.109283
Ce Shi, Lina Ding, Dandan Wang, Jing Zhu, Xiaoli Zheng, Lansheng Zhang, Yun Zhou
Background
Glioblastoma multiforme (GBM) and ischemic stroke (IS) are two major neurological disorders contributing substantially to global mortality and disability. GBM elevates IS risk via prothrombotic mechanisms, while IS may accelerate glioma progression through ischemia-driven neuroinflammation. Identifying shared molecular mediators is essential for understanding their bidirectional pathophysiology.
Methods
A systems biology approach was implemented to investigate shared neurotrophic factor-related genes (NFRGs) between GBM and IS. A total of 2871 NFRGs were screened from GeneCards, with Caspase-3 (CASP3) and Protein Arginine N-Methyltransferase 6 (PRMT6) identified as core regulators. Multi-omics validation included: 1) differential expression profiling across The Cancer Genome Atlas (TCGA)-GBM and Gene Expression Omnibus (GEO) stroke datasets; 2) prognostic stratification using Kaplan-Meier (KM) survival curves with the log-rank test and Cox proportional hazards regression; 3) immune microenvironment analysis via CIBERSORT; 4) experimental validation in middle cerebral artery occlusion (MCAO) mice and GBM cell lines (U87MG, T98G, A172) using Real-Time Quantitative Reverse Transcription PCR (qRT-PCR), Western blot (WB), and immunofluorescence (IF).
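The Kaplan-Meier step in item 2 can be sketched as follows. This is a minimal product-limit estimator written for illustration; the function name `kaplan_meier` and the toy follow-up data are assumptions, and the log-rank test and Cox regression used by the authors are not reproduced here.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving        # deaths and censored both leave the risk set
    return curve

# Usage with made-up follow-up data: the patient at t=2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In a high-versus-low expression comparison, one such curve would be computed per group and the two curves compared with a log-rank test.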
Results
CASP3 and PRMT6 were significantly upregulated in both GBM and IS (P < 0.05). KM survival analysis with log-rank test showed that high expression of CASP3 and PRMT6 was strongly associated with poorer overall survival (OS) in GBM patients (P < 0.001; Hazard Ratio (HR) = 4.375, 95% Confidence Interval (CI) = 3.336–5.738 for CASP3; HR = 4.547, 95% CI = 3.429–6.029 for PRMT6). Receiver operating characteristic (ROC) analysis confirmed robust diagnostic (Area Under the Curve (AUC) > 0.7) and prognostic efficacy for both markers. IF validated their elevated expression in ischemic brain tissues of Middle Cerebral Artery Occlusion (MCAO) mice, while qRT-PCR and WB confirmed higher expression in GBM cells versus normal glial cells. Immune infiltration analysis indicated that CASP3 and PRMT6 are associated with immunosuppressive remodeling in GBM, suggesting their role as a molecular bridge between the two diseases.
Conclusions
Our findings identify CASP3 and PRMT6 as dual molecular mediators coordinating GBM progression and post-IS pathological processes. Targeting these genes may provide novel therapeutic avenues for preventing GBM-associated IS and improving neuro-oncological outcomes.
Title: Identifying neurotrophic factor related genes at the crosstalk between glioblastoma and ischemic stroke. Journal: Computer Methods and Programs in Biomedicine, volume 278, Article 109283.
Pub Date: 2026-05-01; Epub Date: 2026-01-28; DOI: 10.1016/j.cmpb.2026.109263
Rama Krishna Thelagathoti, Chao Jiang, Dinesh S. Chandel, Wesley A. Tom, Cleo Sarmiento, Gary Krzyzanowski, Appolinaire Olou, M. Rohan Fernando
Background and Objective
Reliable detection of robust biomarkers from high-dimensional transcriptomic data remains a major challenge in computational oncology. Traditional approaches often suffer from overfitting and poor generalization due to the high dimensionality of genomic data and limited sample sizes. This study aims to identify an optimal, biologically meaningful subset of mRNA biomarkers capable of distinguishing ovarian cancer samples from healthy controls using an integrated machine learning–based feature selection framework.
Methods
We analyzed mRNA expression data encompassing approximately 63,000 transcripts from ovarian cancer and control samples derived from cell lines. A hybrid feature selection pipeline combining statistical filtering, recursive elimination, and regularization was implemented under stratified cross-validation to derive stable biomarkers. Model validation was performed using Logistic Regression, Random Forest, XGBoost, and Support Vector Machine classifiers, while experimental validation was conducted through droplet digital PCR (ddPCR). Statistical analyses included ANOVA, t-tests, and pathway enrichment.
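The statistical filtering stage of such a pipeline might look like the sketch below, which ranks transcripts by the absolute Welch t-statistic between cancer and control samples and keeps the top k. This is only one plausible filter; the abstract does not name the statistic, so `welch_t`, `top_k_features`, and the toy matrix are assumptions, and the recursive elimination and regularization stages are not shown.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch t-statistic for two samples (each needs at least 2 values)."""
    se = math.sqrt(variance(group_a) / len(group_a) +
                   variance(group_b) / len(group_b))
    return 0.0 if se == 0 else (mean(group_a) - mean(group_b)) / se

def top_k_features(X, y, k):
    """Indices of the k features with the largest |t| between classes 1 and 0.

    X: list of samples, each a list of expression values
    y: list of labels (1 = cancer, 0 = control)
    """
    scored = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scored.append((abs(welch_t(pos, neg)), j))
    scored.sort(reverse=True)
    return [j for _, j in scored[:k]]

# Usage with toy data: feature 0 separates the classes, feature 1 does not.
X = [[5.0, 1.0], [5.2, 0.9], [4.9, 1.1],
     [1.0, 1.0], [1.1, 0.9], [0.9, 1.1]]
y = [1, 1, 1, 0, 0, 0]
selected = top_k_features(X, y, k=1)
```

To avoid optimistic bias, a filter like this should be fitted on the training portion of each cross-validation fold only, which is consistent with running the selection under stratified cross-validation as described above.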
Results
The pipeline identified 80 discriminative mRNA biomarkers with exceptionally high classification performance (accuracy = 1.00, sensitivity = 1.00, specificity = 1.00 for top models). ddPCR confirmed consistent expression patterns, with significant downregulation of ADAMTS12, FN1, and ABI3BP and overexpression of EPCAM, COX6C, and TMT1B in ovarian cancer. Pathway enrichment revealed involvement in DNA repair, RNA processing, protein translation, immune regulation, and metabolic reprogramming.
Conclusions
This hybrid feature selection framework, applied to patient-derived cell lines, effectively reduces dimensionality, enhances biomarker reliability, and uncovers biologically interpretable mRNA signatures associated with ovarian cancer, demonstrating potential for diagnostic and therapeutic applications.
Title: Detecting optimal biomarkers in ovarian cancer cells from high-dimensional mRNA expression data using machine learning. Journal: Computer Methods and Programs in Biomedicine, volume 278, Article 109263.
Pub Date: 2026-05-01; Epub Date: 2026-02-02; DOI: 10.1016/j.cmpb.2026.109267
Zekai Yu, Weihao Cheng
Title: Advancing the vision of “reliability metadata”: From conceptual refinement to clinical validation. Journal: Computer Methods and Programs in Biomedicine, volume 278, Article 109267.
Pub Date: 2026-05-01; Epub Date: 2026-01-29; DOI: 10.1016/j.cmpb.2026.109266
Arsene Adjevi, Abdiwahab Mohamed Abdirashid, Faruk Aktaş, Mustafa Hikmet Bilgehan Ucar, Serdar Solak
Background and Objective:
Effective diabetes management requires continuous regulation of blood glucose in response to complex factors such as diet, activity, stress, and medication. Advances in continuous glucose monitoring and machine learning have improved short-term glucose prediction. However, preprocessing of signals like insulin, carbohydrate intake, heart rate, and activity to better capture metabolic dynamics remains underexplored. Similarly, the integration of predictive models with preventive strategies for guiding interventions is still limited.
Methods:
We propose a research-only decision-support framework combining signal preprocessing, CNN-based glucose prediction, SHapley Additive exPlanations (SHAP) value attribution, and an Actor–Critic Reinforcement Learning (RL) agent. Exponential decay models preprocess the inputs, a compact CNN forecasts short-term glucose levels, and SHAP values highlight the most influential input features; however, these attributions reflect associative patterns in the data and do not establish or map to causal clinical mechanisms. These SHAP-derived attributions guide the RL agent, which issues bounded one-step behavioral adjustments. Because SHAP-guided RL remains stochastic and uncertain, the proposed system is exploratory and not clinically safe, serving solely as a simulation framework.
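One way to picture the exponential decay preprocessing is the sketch below, which turns sparse events (for example insulin boluses or carbohydrate entries) into a smoothly decaying "still active" signal on a regular time grid. The abstract gives no decay constants or equations, so the function name `decayed_signal`, the half-life parameterization, and the numbers are assumptions chosen for illustration.

```python
def decayed_signal(events, dt_min, half_life_min):
    """Accumulate event amounts with exponential decay between samples.

    events:        amount recorded at each sample (0.0 when nothing happened)
    dt_min:        sampling interval in minutes (CGM data is often 5 min)
    half_life_min: hypothetical half-life of the effect in minutes
    """
    keep = 0.5 ** (dt_min / half_life_min)   # fraction remaining after one step
    level = 0.0
    out = []
    for amount in events:
        level = level * keep + amount
        out.append(level)
    return out

# Usage: a single 10-unit event, sampled every 5 min with a 5 min half-life.
active = decayed_signal([10.0, 0.0, 0.0], dt_min=5, half_life_min=5)
```

A decayed channel of this kind gives a forecasting model a continuous proxy for residual insulin or carbohydrate effect instead of isolated spikes.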
Results:
Using the OhioT1DM dataset, the model achieved state-of-the-art RMSE across prediction horizons with a compact size of ~74 KB per patient and training under one minute for 1000 epochs. Over 98% of predictions fell within Clarke Error Grid Zones A and B, confirming safe 5–20 min forecasts. The preventive component corrected hyper- and hypoglycemia in ~25% of cases within 10 min when predictions were near 80–120 mg/dL (±10 mg/dL). When deviations exceed ±10 mg/dL, the RL agent is unable to fully restore blood glucose to the target range within 10 min but can bring it as close as possible to the defined interval.
Conclusions:
This study presents a significant innovation by bridging predictive accuracy, adaptability, and transparency in diabetes management. The integration of a predictive model with Reinforcement Learning (RL) guided by SHAP values, which are typically used for interpretability but here are employed in the learning process, delivers a powerful decision support framework. This approach advances the field toward next-generation, personalized digital health tools.
Title: Explainable reinforcement learning for glucose monitoring based on shapley value analysis. Journal: Computer Methods and Programs in Biomedicine, volume 278, Article 109266.
Background and Objectives
The development of viral resistance can significantly reduce the effectiveness of therapy. Human immunodeficiency virus type 1 (HIV-1) causes chronic immune dysfunction, leading to co-infections and serious complications. Despite worldwide progress and consolidated efforts to overcome HIV drug resistance, rational drug therapy of HIV infection still needs novel approaches: models that predict with high accuracy and can be applied to evaluate resistance against a wide variety of inhibitors. Our study is dedicated to the development of a novel computational ML-driven approach for the ternary classification of HIV protease, reverse transcriptase, and integrase sequences. Binary classification approaches inherently cannot capture clinically important intermediate resistance levels, which motivates the use of a ternary classification model.
Methods
For model development we used the Self-Consistent Extreme Classifier. One-versus-rest and one-versus-one ternary approaches were applied to sequence-related resistance data from the Stanford University HIV Drug Resistance Database (StDB).
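The two decomposition schemes can be sketched independently of the base classifier. The code below only shows how ternary labels are split into binary sub-problems and how binary outputs are recombined; the class names, function names, and scores are hypothetical, and the Self-Consistent Extreme Classifier itself is not implemented here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical names for the three resistance levels.
CLASSES = ("susceptible", "intermediate", "resistant")

def one_vs_rest_tasks(labels, classes=CLASSES):
    """One binary target vector per class: 1 for that class, 0 for the rest."""
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def one_vs_one_tasks(labels, classes=CLASSES):
    """One binary task per class pair, restricted to samples of those two classes."""
    tasks = {}
    for a, b in combinations(classes, 2):
        tasks[(a, b)] = [(i, 1 if y == a else 0)
                         for i, y in enumerate(labels) if y in (a, b)]
    return tasks

def predict_one_vs_rest(class_scores):
    """Pick the class whose binary model gives the highest score."""
    return max(class_scores, key=class_scores.get)

def predict_one_vs_one(pair_winners):
    """Pick the class that wins the most pairwise contests."""
    return Counter(pair_winners.values()).most_common(1)[0][0]

# Usage with made-up labels and model outputs.
labels = ["susceptible", "resistant", "intermediate", "resistant"]
ovr = one_vs_rest_tasks(labels)
ovo = one_vs_one_tasks(labels)
ovr_call = predict_one_vs_rest(
    {"susceptible": 0.2, "intermediate": 0.7, "resistant": 0.4})
ovo_call = predict_one_vs_one({
    ("susceptible", "intermediate"): "intermediate",
    ("susceptible", "resistant"): "resistant",
    ("intermediate", "resistant"): "intermediate",
})
```

One-versus-rest trains three models on all sequences, while one-versus-one trains three models on two-class subsets; with three classes the model counts coincide, but the training sets differ.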
Results
For the final classifiers we selected the most appropriate models, with 0.913 sensitivity, 0.894 specificity, 0.741 precision, and 0.953 area under the ROC curve, all values given as averages. We tested our approach in a clinical task and performed prospective validation for eight sequences of HIV protease and reverse transcriptase obtained from treatment-naive HIV-positive male patients. We performed a prediction and compared the results with the therapeutic outcome, in particular with the viral load decline at 24 weeks.
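Averaged sensitivity, specificity, and precision for a three-class problem are often obtained by treating each class one-versus-rest and taking the unweighted mean. The sketch below shows that macro-averaging scheme; whether the authors averaged over classes, drugs, or cross-validation folds is not stated, so this is an assumption, as are the function name and the toy labels.

```python
def macro_metrics(y_true, y_pred, classes):
    """Macro-averaged sensitivity, specificity, and precision.

    Each class is scored one-versus-rest; a metric whose denominator
    is zero for some class is counted as 0.0 for that class.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sens, spec, prec = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        tn = sum(1 for t, p in zip(y_true, y_pred) if t != c and p != c)
        sens.append(ratio(tp, tp + fn))
        spec.append(ratio(tn, tn + fp))
        prec.append(ratio(tp, tp + fp))
    n = len(classes)
    return sum(sens) / n, sum(spec) / n, sum(prec) / n

# Usage with made-up labels (S = susceptible, I = intermediate, R = resistant).
y_true = ["S", "S", "I", "I", "R", "R"]
y_pred = ["S", "I", "I", "I", "R", "S"]
sensitivity, specificity, precision = macro_metrics(y_true, y_pred, ["S", "I", "R"])
```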
Conclusions
The results of the prospective validation are generally consistent with the results of the therapeutic outcome and confirm the possibility of using the developed approach for the selection of the most appropriate therapeutic regimens.
Title: A computational approach for classification of HIV drug resistance based on the self-consistent extreme classifier. Authors: L.A. Stolbov, A.V. Rudik, E.A. Stolbova, A.V. Pokrovskaya, A.B. Shemshura, D.E. Kireev, A.A. Lagunin, D.A. Filimonov, V.V. Poroikov, O.A. Tarasova. DOI: 10.1016/j.cmpb.2026.109268. Journal: Computer Methods and Programs in Biomedicine, volume 278, Article 109268.
Pub Date : 2026-05-01Epub Date: 2026-02-10DOI: 10.1016/j.cmpb.2026.109284
Amit Raj , Raghvendra Gupta , Anugrah Singh
Background and Objective
Cerebral aneurysms are pathological dilations of intracranial arteries that commonly develop at arterial bifurcations. At these locations, hemodynamic forces significantly affect the structural properties of the vascular walls, leading to focal weakening and vessel remodeling. This study aims to evaluate the influence of wall compliance and tissue prestress on aneurysmal hemodynamics and wall mechanics using a fluid–structure interaction (FSI) framework. The effect of the shear-thinning behaviour of blood is also studied.
Methods
Blood flow and its effect on the vessel walls are modelled in a patient-specific cerebral aneurysm. Physiologically realistic inflow conditions derived from PC-MRI are used as the inlet boundary condition, and a three-element Windkessel model is used to specify the outlet boundary condition to account for the effect of the downstream vasculature. Prestress is applied to the arterial wall to mimic the in vivo stressed state of the vessel wall. Simulations are performed using the Arbitrary Lagrangian–Eulerian (ALE) FSI approach under different considerations of wall compliance, blood rheology, and prestress, both individually and in combination. The computational framework is validated against analytical and numerical solutions available in the literature.
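A three-element Windkessel outlet is commonly written as a proximal resistance in series with a compliance and a distal resistance in parallel: P(t) = R_p Q(t) + P_c(t), with C dP_c/dt = Q(t) − P_c(t)/R_d. The sketch below integrates this standard form with forward Euler. It is an illustration under assumed, unit-free parameter values, not the authors' solver coupling, and the function and argument names are hypothetical.

```python
def windkessel_pressure(flow, dt, r_proximal, r_distal, compliance, p_c0=0.0):
    """Outlet pressure history for a sampled outlet flow rate.

    flow        : sequence of flow-rate samples Q(t_k), uniformly spaced by dt
    r_proximal  : proximal (characteristic) resistance R_p
    r_distal    : distal resistance R_d
    compliance  : compliance C of the downstream vasculature
    p_c0        : initial pressure across the compliance
    """
    pressures = []
    p_c = p_c0
    for q in flow:
        # Outlet pressure = drop over R_p plus pressure stored in the compliance.
        pressures.append(r_proximal * q + p_c)
        # Forward-Euler update of C * dP_c/dt = Q - P_c / R_d.
        p_c += dt * (q - p_c / r_distal) / compliance
    return pressures


if __name__ == "__main__":
    # Constant unit inflow: the pressure relaxes towards Q * (R_p + R_d).
    history = windkessel_pressure([1.0] * 20000, dt=0.001,
                                  r_proximal=1.0, r_distal=10.0, compliance=0.1)
    print(history[0], history[-1])
```

Forward Euler is chosen here only for brevity; it needs a time step well below R_d·C to stay stable, and an FSI solver would typically use an implicit update instead.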
Results
Accounting for wall compliance leads to increased inflow into the aneurysm sac and a reduced pressure drop between the inlet and outlet over a cardiac cycle. In the flexible-wall model, a single, stable vortex core is observed in the dome instead of the multiple vortices observed in the rigid-wall case. Further, flexible walls reduce the peak time-averaged wall shear stress (TAWSS) by ∼20% and shrink both the dome area exposed to low TAWSS and the regions with a high oscillatory shear index (OSI). Including prestress in the model proves critical, as it reduces wall displacement by up to 72% and peak tensile stress by up to 83% at peak systole. Accounting for the shear-thinning behaviour of blood further decreases peak TAWSS by up to 25% and reduces the area with low TAWSS, but has minimal effect on wall displacement and tensile stress.
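For reference, the two wall shear stress indices named above are usually defined over one cardiac cycle of length T as TAWSS = (1/T)∫|τ_w| dt and OSI = ½ (1 − |∫τ_w dt| / ∫|τ_w| dt). The sketch below evaluates both at a single wall point from uniformly sampled shear stress vectors. The function name and the sampling assumption are illustrative and are not taken from the paper.

```python
import math


def tawss_and_osi(wss_samples):
    """Return (TAWSS, OSI) at one wall point.

    wss_samples : list of (x, y, z) wall shear stress vectors, uniformly
                  sampled over exactly one cardiac cycle (assumption).
    """
    n = len(wss_samples)
    # Time average of the magnitude, i.e. TAWSS.
    mean_magnitude = sum(math.sqrt(x * x + y * y + z * z)
                         for x, y, z in wss_samples) / n
    # Magnitude of the time-averaged vector.
    mean_vector = [sum(s[i] for s in wss_samples) / n for i in range(3)]
    magnitude_of_mean = math.sqrt(sum(c * c for c in mean_vector))
    # OSI is 0 for unidirectional shear and 0.5 for fully reversing shear.
    if mean_magnitude > 0:
        osi = 0.5 * (1.0 - magnitude_of_mean / mean_magnitude)
    else:
        osi = 0.0
    return mean_magnitude, osi


if __name__ == "__main__":
    print(tawss_and_osi([(2.0, 0.0, 0.0)] * 4))
    print(tawss_and_osi([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)] * 2))
```

With uniform sampling, the cycle length T cancels in both ratios, which is why plain sample means are sufficient here.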
Conclusions
Wall compliance, blood rheology, and prestress substantially influence aneurysmal hemodynamics and wall mechanics, with prestress having the dominant effect in reducing wall deformation and stress.
{"title":"Patient-specific fluid-structure interaction modeling of cerebral aneurysm: influence of wall compliance, tissue prestress, and blood rheology","authors":"Amit Raj , Raghvendra Gupta , Anugrah Singh","doi":"10.1016/j.cmpb.2026.109284","DOIUrl":"10.1016/j.cmpb.2026.109284","url":null,"PeriodicalId":10624,"journal":{"name":"Computer methods and programs in biomedicine","volume":"278 ","pages":"Article 109284"},"PeriodicalIF":4.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146186316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-05-01Epub Date: 2026-01-24DOI: 10.1016/j.cmpb.2026.109262
Mengmeng Fan , Dakuo He , Qian Liu , Qing Liu , Feng Wang , He Li , Hao Wang , Siqi Shan , Jinghao Zhang , Yue Hou
Background and objective:
Hydroxycarboxylic acid receptor 1 (HCAR1), also known as the lactate receptor, is closely associated with tumorigenesis and cancer progression due to its aberrant activation, making it an attractive therapeutic target for cancer treatment. Accurate prediction of HCAR1 antagonists is therefore crucial for tumor immunotherapy. However, traditional drug screening suffers from high costs and suboptimal performance caused by imbalanced datasets and incomplete molecular representations, contributing to the scarcity of clinically available HCAR1 antagonists.
Methods:
A balanced HCAR1 target activity dataset was constructed using a boundary-selected negative sampling strategy. Subsequently, a multi-level graph neural network (Multi-GNN) was proposed for HCAR1 target activity prediction, integrating multiple molecular representations, including fingerprints, molecular graphs, and fragment-level features.
Results:
Experimental results demonstrate that the proposed model outperforms eight state-of-the-art methods in comparative evaluations. Furthermore, approximately ten million compounds were screened using the trained Multi-GNN model in combination with physicochemical filtering and molecular docking, yielding five candidate compounds. Finally, in vitro cAMP antagonistic activity assays identified a promising HCAR1 inhibitor with an IC₅₀ of 22.39 μM.
Conclusions:
This study introduces a novel artificial intelligence-based framework for HCAR1-targeted drug discovery and highlights potential lead compounds for further development.
{"title":"HCAR1 antagonist screening based on boundary-selected negative sampling strategy and multi-level graph neural network","authors":"Mengmeng Fan , Dakuo He , Qian Liu , Qing Liu , Feng Wang , He Li , Hao Wang , Siqi Shan , Jinghao Zhang , Yue Hou","doi":"10.1016/j.cmpb.2026.109262","DOIUrl":"10.1016/j.cmpb.2026.109262","url":null,"PeriodicalId":10624,"journal":{"name":"Computer methods and programs in biomedicine","volume":"278 ","pages":"Article 109262"},"PeriodicalIF":4.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146112531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Khalid Ansari, Unnati Chaurasia, Himanshu Kumar Pathak, Koushlendra Kumar Singh, Jitesh Pradhan
Background and Objective:
Worldwide, over 50 million people suffer from epilepsy, a neurological disorder characterised by recurrent seizures due to abnormal electrical activity in the brain. Seizures result from sudden electrical surges, and the symptoms vary with the region of the brain affected, ranging from brief staring spells and confusion to convulsions and loss of consciousness. Physicians typically distinguish four main seizure phases: Interictal, Preictal, Ictal, and Postictal. Accurate analysis of EEG signals around seizure onset is critical for timely clinical intervention. However, current methodologies mostly rely on complex Convolutional Neural Networks (CNNs) with millions of parameters. These require high computational power and are therefore difficult to deploy in wearable devices. The core idea of this work is to develop a computationally compact architecture for seizure onset discrimination that offers potential for future integration with wearable devices.
Methods:
To achieve this, this work proposes a DNA-based encoding framework for Electroencephalogram (EEG) signals. Existing DNA-based compression techniques have demonstrated significant potential in reducing data complexity. Multichannel EEG signals recorded with 23 scalp electrodes are obtained from the CHB-MIT dataset and normalised using min–max scaling. The signals are then windowed to capture temporal dependencies and transformed into integer-safe magnitudes before being converted to binary. Genetic coding-based preprocessing follows, in which genetic transcription and translation (DNA → RNA → Codons → Amino Acids) are applied. By converting EEG signal data to amino acid sequences, the proposed encoding scheme aims to capture underlying patterns in the data and provide a compact representation of temporal patterns. The encoded sequences are subsequently processed using a lightweight one-dimensional multi-level parallel CNN architecture.
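The encoding chain described above (min–max scaling, windowing, integer magnitudes, binary, DNA, RNA, codons, amino acids) can be sketched as follows. Several details are assumptions made for illustration only: 8-bit quantisation, the bit-pair-to-nucleotide map (00→A, 01→C, 10→G, 11→T), treating transcription as a T→U substitution, and the use of the standard genetic code with `*` for stop codons. The abstract does not specify these choices, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumed mapping from bit pairs to DNA nucleotides.
BIT_PAIR_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def min_max(signal):
    """Scale a sequence of samples to the range [0, 1]."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for flat signals
    return [(s - lo) / span for s in signal]


def windows(signal, size, step):
    """Yield consecutive windows of `size` samples, advancing by `step`."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def encode_window(normalised, bits=8):
    """Encode one window of [0, 1] samples as an amino acid string."""
    levels = (1 << bits) - 1
    integers = [int(v * levels) for v in normalised]            # integer magnitudes
    bitstring = "".join(format(v, f"0{bits}b") for v in integers)
    dna = "".join(BIT_PAIR_TO_BASE[bitstring[i:i + 2]]
                  for i in range(0, len(bitstring), 2))         # binary -> DNA
    rna = dna.replace("T", "U")                                 # transcription
    usable = len(rna) - len(rna) % 3                            # drop partial codon
    return "".join(CODON_TABLE[rna[i:i + 3]]
                   for i in range(0, usable, 3))                # translation


if __name__ == "__main__":
    channel = min_max([2.0, 4.0, 6.0, 3.0, 5.0, 2.5])
    for w in windows(channel, size=3, step=3):
        print(encode_window(w))
```

With 8 bits per sample, each sample yields four nucleotides, so a window of three samples maps to exactly four codons; other window lengths may leave a partial codon, which this sketch simply discards.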
Results and Conclusion:
These DNA-encoded EEG sequences are used as input to the proposed 1D multi-level parallel CNN model, which has drastically fewer parameters than conventional CNN models. After extensive testing, the proposed model achieves an accuracy of 96.22%. The encoding framework was also evaluated on early seizure prediction under a subject-wise protocol, where it achieves an accuracy of 93.87%. Overall, these findings indicate that the proposed approach provides a compact and effective representation for EEG-based seizure analysis across related onset and early prediction tasks.
{"title":"DNA-Driven EEG monitoring for rapid seizure prediction in healthcare","authors":"Khalid Ansari, Unnati Chaurasia, Himanshu Kumar Pathak, Koushlendra Kumar Singh, Jitesh Pradhan","doi":"10.1016/j.cmpb.2026.109277","DOIUrl":"10.1016/j.cmpb.2026.109277","url":null,"PeriodicalId":10624,"journal":{"name":"Computer methods and programs in biomedicine","volume":"278 ","pages":"Article 109277"},"PeriodicalIF":4.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146177268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}