Pub Date: 2026-02-16 | DOI: 10.1016/j.compbiomed.2026.111470
Zaibunnisa L H Malik, Pooja Raundale
Motor impairments affect approximately 86.9% of children with Autism Spectrum Disorder (ASD), often persisting into adolescence and increasing the risk of Developmental Coordination Disorder (DCD). Despite their prevalence, only 31.6% of affected individuals receive physical therapy, underscoring a critical gap in early intervention. Traditional methods for diagnosing Fine Motor Deficits (FMD) are often time-consuming and costly, necessitating the adoption of data-driven approaches. This study introduces a machine learning framework for the rapid and reliable prediction of fine motor impairments in adolescents with ASD. By integrating EEG-based neurophysiological signals, behavioral assessments, and motor coordination tests, the study evaluates five classification models: Logistic Regression, Support Vector Machine, K-Nearest Neighbors, Random Forest, and Neural Network. Among these, Logistic Regression achieved the highest accuracy (95.84%), demonstrating strong predictive power for identifying fine motor deficits. The proposed framework enhances the efficiency of FMD screening and provides an interpretable model for potential clinical use in early ASD diagnosis.
Title: Predicting fine motor deficit in autism by measuring brain activities and characterizing motor impairments. Journal: Computers in Biology and Medicine, vol. 204, 111470.
Pub Date: 2026-02-15 | Epub Date: 2026-01-19 | DOI: 10.1016/j.compbiomed.2026.111493
Mattia Perpenti, Federico Mento, Giovanni Pierro, Alessandro Perrotta, Tiziano Perrone, Andrea Smargiassi, Riccardo Inchingolo, Libertario Demi
Title: Corrigendum to "Fully automated quantitative lung ultrasound spectroscopy for the differential diagnosis of lung diseases: The first multicenter in-vivo clinical study" [Comput. Biol. Med. (200), 1 January 2026, 111365]. Journal: Computers in Biology and Medicine, 111493.
Pub Date: 2026-02-15 | Epub Date: 2026-01-19 | DOI: 10.1016/j.compbiomed.2026.111483
Wanus Srimaharaj
Title: Corrigendum to "Brain dysfunction assessment in Alzheimer's disease: A phase-space projection and interactive signal decomposition framework" [Comput. Biol. Med. (2026) 111440 201]. Journal: Computers in Biology and Medicine, 111483.
Pub Date: 2026-02-14 | DOI: 10.1016/j.compbiomed.2026.111546
Hanwen Ju, Joel A. Dubin
Sepsis remains one of the leading causes of death worldwide, and despite extensive research, uncertainties persist regarding its treatment outcomes due to the diversity of the condition and characteristics across patients. Identifying subpopulations of sepsis patients with distinct clinical behaviors can be instrumental in developing more targeted and effective interventions. In this study, we build on previous work that applied clustering techniques to the large-cohort, single-hospital MIMIC-III intensive care unit (ICU) database by extending the analysis to the larger-cohort, multi-hospital eICU database. We employ multiple dimensionality-reduction methods such as t-distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and Variational Autoencoders (VAE) in combination with density-based clustering (DBSCAN) and use Self-Organizing Maps (SOM) as an extra topological validation. Our approach was able to uncover recognizable subpopulations of sepsis with some shared characteristics, both validating many results from the previous MIMIC-III analysis and identifying new results that appear indicative of the more heterogeneous eICU database.
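To make the density-based clustering step concrete, the sketch below is a minimal pure-Python DBSCAN run on a handful of made-up 2-D points that stand in for a low-dimensional embedding. It is not the authors' pipeline: the points, `eps`, and `min_pts` are illustrative assumptions, and a real analysis would more likely rely on an established library implementation.

```python
import math
from typing import List, Sequence, Tuple

NOISE = -1       # label for points that belong to no cluster
UNVISITED = -2   # internal marker used before a point is processed

Point = Tuple[float, float]

def region_query(points: Sequence[Point], idx: int, eps: float) -> List[int]:
    """Indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]

def dbscan(points: Sequence[Point], eps: float, min_pts: int) -> List[int]:
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may later be relabelled as a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also a core point: keep expanding
    return labels

# Hypothetical embedding: two tight groups and one isolated point.
embedding = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
    (5.0, 5.0), (5.0, 5.1), (5.1, 5.0), (5.1, 5.1),
    (10.0, 0.0),
]
labels = dbscan(embedding, eps=0.3, min_pts=3)
print(labels)
```

Because DBSCAN leaves low-density points unassigned, it can flag patients that fit no subpopulation instead of forcing them into one, which is one plausible reason to pair it with an embedding for a heterogeneous cohort.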
Title: Unsupervised identification of sepsis subpopulations in the eICU database: A multi-method clustering approach with validation. Journal: Computers in Biology and Medicine, vol. 204, Article 111546.
Pub Date: 2026-02-13 | DOI: 10.1016/j.compbiomed.2026.111550
Andrielly H.S. Costa, Eduardo M. Gaieta, Aline O. Albuquerque, Julia S. Souza, Diego S. Almeida, Jean V. Sampaio, Patrick England, Geraldo R. Sartori, João H.M. Silva
Galectin-3 binding protein (Gal-3BP) is a clinically relevant oncology target, with overexpression associated with poor prognosis across multiple tumor types. However, its therapeutic exploration has been hindered by extensive glycosylation, conformational heterogeneity, and context-dependent oligomerization, which restrict epitope accessibility. Antibody-based strategies remain promising for targeting such complex proteins, yet their development is costly and experimentally demanding. To address these challenges, we established an integrative in-silico workflow tailored to the specific structural and biophysical features of Gal-3BP combining validated methodologies of structural prediction, molecular dynamics (MD) simulations, and antibody engineering. By mapping Gal-3BP across oligomeric states and characterizing its N-glycan conformational diversity, we identified two glycan-free epitopes within the BACK domain, termed E1 and E2. Scaffold selection using 3D Zernike descriptors–based similarity search identified BDBV-43 as a compatible candidate for E1. For E2, which lacked similarity-based matches, naïve repertoire mining retrieved the unmatured antibody E2-Ab1, broadening the set of viable templates. Engineering approaches included point mutations in BDBV-43 and full CDR swapping in E2-Ab1. Iterative refinement yielded variants with improved interaction profiles and robust stability during heated MD simulations. Furthermore, Gaussian accelerated MD (GaMD) revealed reorganized conformational landscapes together with modest shifts in the underlying free-energy profiles for the engineered antibodies relative to their native scaffolds, in line with the interpretative limits of GaMD reweighting. 
Collectively, this study positions Gal-3BP as a tractable therapeutic target and presents optimized antibody candidates capable of engaging epitopes minimally affected by glycan shielding, illustrating the potential of integrative computational pipelines for antibody design against structurally complex proteins.
Title: Overcoming structural complexity in Galectin-3BP through an integrative computational antibody design workflow. Journal: Computers in Biology and Medicine, vol. 204, Article 111550.
Pub Date: 2026-02-13 | DOI: 10.1016/j.compbiomed.2026.111548
Sampath Rapuri, Kirby Gong, Carl Harris, Robert D. Stevens
Background
Pulmonary embolism (PE) is a leading cause of preventable death, yet statistical prediction models have shown inconsistent validity. Our primary objective was to determine if a machine learning model trained with data routinely collected in clinical care can successfully identify acute PE in critically ill patients.
Methods
Leveraging two multicenter datasets acquired nationally (development cohort) and within the Johns Hopkins Health System (external validation cohort), we trained machine learning models with features extracted from demographics, comorbidities, physiologic and laboratory data available following intensive care unit (ICU) admission. The primary endpoint was the identification of acute PE during ICU admission. Model performance was contrasted with two benchmark PE risk scores.
Findings
PE was diagnosed in 2647 of 164,383 (1.61%) and 754 of 64,923 admissions (1.16%) in the development and external validation datasets, respectively. Using data from the first 48 h after ICU admission, the mean (95% CI) discrimination measured by area under the receiver operating characteristic curve (AUROC) was 0.829 (0.808–0.852), 0.704 (0.681–0.727), and 0.667 (0.653–0.681) for our logistic regression machine learning model and for the two benchmark scores, respectively; mean area under the precision-recall curve was 0.150 (0.138–0.162), 0.080 (0.071–0.089), and 0.081 (0.071–0.091), respectively. Discrimination was maintained in the external validation dataset with an AUROC of 0.819 (0.802–0.836).
Interpretation
Findings indicate that PE can be detected accurately in ICU patients using routinely collected clinical data. The machine learning model successfully validated and outperformed existing benchmark risk scores. Such a model could become a valuable tool for assessing the likelihood of PE among critically ill patients.
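The precision-recall figures in the Findings are easier to read against the outcome prevalence, because a classifier that ranks admissions at random has an expected area under the precision-recall curve equal to the prevalence. The short sketch below recomputes the two prevalences from the reported counts and expresses each reported value as a multiple of that chance level. It assumes the reported precision-recall values refer to the development cohort, and the keys `benchmark_1` and `benchmark_2` are placeholder names, since the abstract does not name the two risk scores.

```python
# Counts as reported in the Findings paragraph: (PE cases, ICU admissions).
cohorts = {
    "development": (2647, 164_383),
    "external_validation": (754, 64_923),
}

def prevalence(cases: int, admissions: int) -> float:
    """Fraction of admissions with a PE diagnosis."""
    return cases / admissions

dev_prev = prevalence(*cohorts["development"])
val_prev = prevalence(*cohorts["external_validation"])

# Reported mean areas under the precision-recall curve.
# Assumption: these were measured on the development cohort.
reported_auprc = {"ml_model": 0.150, "benchmark_1": 0.080, "benchmark_2": 0.081}

# Ratio to the chance-level value (the prevalence) as a rough "lift".
lift = {name: value / dev_prev for name, value in reported_auprc.items()}

print(f"development prevalence: {dev_prev:.2%}")
print(f"validation prevalence:  {val_prev:.2%}")
for name, value in lift.items():
    print(f"{name}: about {value:.1f}x the chance-level AUPRC")
```

Read this way, an absolute value of 0.150 is low in isolation but sits well above what random ranking would give at a prevalence under 2%.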
Title: A machine learning model to identify pulmonary embolism in patients admitted to intensive care. Journal: Computers in Biology and Medicine, vol. 204, Article 111548.
Pub Date: 2026-02-12 | DOI: 10.1016/j.compbiomed.2026.111551
Jonghyun Hong, Jungmin Koh, Jinyoung Kim, Hyunchan Ryu, Dahye Lee, Hyun Bin Kwon, Byunghun Choi, Heesu Park, Kwang Suk Park, Heenam Yoon
Purpose
Sleep posture is associated with various physiological indicators and significantly influences sleep health and quality. Although several methods for posture estimation have been proposed, most have been evaluated using data from controlled laboratory environments. This study proposes a method for determining sleep posture in real-world settings using pressure sensor data.
Methods
The approach was developed based on data collected from 22 participants in a laboratory setting using a 7 × 14 array of force-sensitive resistors (FSR). We employed a support vector machine to classify four sleep postures—supine, left-lateral, right-lateral, and prone—based on six extracted features related to area, curvature, and row length ratio. The algorithm was subsequently evaluated using FSR data recorded from ten participants sleeping freely in their home environments.
Results
The performance results demonstrated an accuracy of 78.1% and a Cohen's kappa of 0.71 for the laboratory data. When applied to the home-environment data, the method achieved an accuracy of 86.1% and a Cohen's kappa of 0.76 for the classification of the four sleep postures.
Conclusion
These findings indicate that the model trained in a laboratory setting maintained high performance in real-world conditions, supporting the feasibility of implementing sleep monitoring technologies in daily life and clinical contexts. This study contributes to the development of noninvasive, long-term sleep monitoring systems and highlights the potential for future clinical applications in embedded systems and hospital environments through the use of feature-based models with high explainability.
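The Results report both accuracy and Cohen's kappa, and the two can diverge when some postures are far more common than others. As a minimal sketch, the function below computes kappa from a confusion matrix by comparing observed agreement with the agreement expected from the row and column totals alone. The four-posture matrix is invented for illustration and does not reproduce the study's data.

```python
from typing import Sequence

def cohen_kappa(confusion: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal (this is the accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical counts, ordered supine, left-lateral, right-lateral, prone.
toy_confusion = [
    [50, 3, 2, 5],
    [4, 30, 1, 0],
    [3, 1, 28, 1],
    [6, 0, 1, 15],
]
toy_accuracy = sum(toy_confusion[i][i] for i in range(4)) / sum(map(sum, toy_confusion))
toy_kappa = cohen_kappa(toy_confusion)
print(f"accuracy = {toy_accuracy:.3f}, kappa = {toy_kappa:.3f}")
```

Kappa is lower than accuracy here because part of the raw agreement would be expected by chance given how often each posture occurs.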
Title: Unobtrusive sleep posture estimation using pressure sensor in home sleep. Journal: Computers in Biology and Medicine, vol. 204, Article 111551.
Pub Date: 2026-02-11 | DOI: 10.1016/j.compbiomed.2026.111547
Mark Germaine, Yitayeh Belsti, Amy O'Higgins, Brendan Egan, Helena Teede, Graham Healy, Joanne Enticott
Background
Although many risk prediction models have been developed, very few undergo external validation, primarily due to issues with data access. Therefore, we implemented a reciprocal model-exchange approach to facilitate external validation and demonstrate its use with gestational diabetes mellitus (GDM) prediction models.
Objective
To assess the robustness and generalisability of two independently developed GDM risk prediction models using a reciprocal model-exchange framework.
Methods
Two independently developed GDM risk prediction models were externally validated using a reciprocal model-exchange. The saved model's corresponding variable types and data pre-processor were exchanged. The Monash CatBoost model was validated using Irish data at Dublin City University (DCU), and the DCU logistic-regression GDM model was validated using Australian data at Monash University. Performance was assessed using discrimination, calibration and decision curve analysis. Model fairness was assessed.
Results
The prevalence of GDM was 21.1% in the Australian cohort and 11.7% in the Irish cohort. The Monash model's AUC dropped from 0.93 to 0.77, while the DCU model's AUC fell from 0.82 to 0.69. Calibration estimates confirmed systematic risk misestimation; each model tends to over- or under-predict GDM probabilities outside its training domain, with calibration-in-the-large of −0.573 for the Monash model and 0.17 for the DCU model; slopes were 1.278 and 0.55, respectively. Both models showed performance variability across ethnic groups, with lower performance for Southeast/Northeast Asians, and both performed better with increasing parity and among women without a prior GDM diagnosis.
Conclusions
Each model's performance decreased upon external validation, and the fairness evaluations on the different sub-categories (ethnicities; parity and previous GDM) provided evidence on the areas to be addressed in model recalibration/updating before deployment can proceed. This reciprocal model-exchange approach provides a way to facilitate external validations, which are notably lacking in the current literature but are necessary to advance the risk prediction field.
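Calibration-in-the-large, as reported in the Results, is commonly defined as the intercept of a logistic model in which the logit of the predicted risk enters as a fixed offset with slope 1. The sketch below estimates that intercept by bisection under this standard definition. It is an illustration and not the authors' code, and the function names and toy inputs are assumptions.

```python
import math
from typing import Sequence

def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def calibration_in_the_large(y: Sequence[int], p: Sequence[float]) -> float:
    """Intercept a such that the mean of sigmoid(a + logit(p_i)) equals the mean of y.

    A negative value means predicted risks are too high on average,
    a positive value means they are too low. Requires both outcome
    classes to be present and every p_i strictly inside (0, 1).
    """
    offsets = [logit(pi) for pi in p]
    target = sum(y)
    lo, hi = -20.0, 20.0
    # The expected event count increases monotonically with a, so bisection works.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(sigmoid(mid + o) for o in offsets) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Toy example: every patient is given a 50% risk but only 1 in 4 has the outcome.
toy_outcomes = [1, 0, 0, 0]
toy_risks = [0.5, 0.5, 0.5, 0.5]
toy_citl = calibration_in_the_large(toy_outcomes, toy_risks)
print(f"calibration-in-the-large: {toy_citl:.3f}")
```

In the toy case the intercept equals the log-odds of 0.25, which reflects that a model predicting 50% risk in a population with a 25% event rate over-predicts on average.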
Title: External validation of GDM risk prediction models using a machine learning reciprocal model-exchange framework. Journal: Computers in Biology and Medicine, vol. 204, Article 111547.
Pub Date: 2026-02-11 | DOI: 10.1016/j.compbiomed.2026.111530
Komal Tanwar, Viney Kumar, Manish Dev Shrimali, Jai Prakash Tripathi
Misinformation about vaccination poses a significant public health threat by reducing vaccination rates and increasing disease burden. Understanding population heterogeneity can aid in recognizing and mitigating the effects of such misinformation, especially when vaccine effectiveness is low. Our research quantifies the impact of misinformation on vaccination uptake and explores its effects in heterogeneous versus homogeneous populations. We employed a dual approach combining compartmental modeling and complex network analysis to examine how various epidemiological parameters influence disease spread and vaccination behaviour. Our results indicate that misinformation significantly lowers vaccination rates, particularly in homogeneous populations, while heterogeneous populations demonstrate greater resilience. Among network topologies, small-world networks achieve higher vaccination rates under varying vaccine efficacies, whereas scale-free networks experience reduced vaccine coverage with higher misinformation amplification. Notably, cumulative infection remains independent of the disease transmission rate when the vaccine is partially effective. In small-world networks, cumulative infection shows high stochasticity across vaccination rates and misinformation parameters, while cumulative vaccination is highest with higher vaccination rates and lower misinformation coefficients. Public health efforts should prioritize addressing misinformation to control disease spread, particularly in homogeneous populations and scale-free networks, where its impact is more severe. Additionally, our model demonstrates strong performance on real-world contact networks, capturing how rapid misinformation spread and limited vaccine efficacy can severely hinder vaccination uptake and accelerate infection rates. Building resilience by fostering diverse community networks and promoting reliable vaccine information can boost vaccination rates. 
Furthermore, focusing public health campaigns on small-world networks may result in higher vaccine uptake, even when vaccine efficacy varies. These insights can help public health policymakers design effective vaccination strategies that consider population heterogeneity.
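The qualitative link between misinformation, vaccination uptake, and infection can be illustrated with a deliberately simple compartmental sketch. The model below is not the authors' model: it is a hypothetical susceptible-vaccinated-infected-recovered system, integrated with forward Euler, in which a `misinformation` coefficient between 0 and 1 scales down the vaccination rate and a partially effective vaccine reduces but does not remove susceptibility. All parameter names and values are assumptions chosen for illustration.

```python
from typing import Dict

def simulate(
    beta: float = 0.3,            # transmission rate (assumed)
    gamma: float = 0.1,           # recovery rate (assumed)
    vacc_rate: float = 0.02,      # baseline per-day vaccination rate (assumed)
    efficacy: float = 0.8,        # fraction of transmission blocked by the vaccine (assumed)
    misinformation: float = 0.0,  # 0 = none, 1 = vaccination fully suppressed (assumed)
    days: float = 300.0,
    dt: float = 0.1,
) -> Dict[str, float]:
    """Forward-Euler integration of a toy S-V-I-R model with misinformation."""
    s, v, i, r = 0.99, 0.0, 0.01, 0.0
    cumulative_infection = i
    cumulative_vaccination = 0.0
    effective_rate = vacc_rate * (1.0 - misinformation)
    for _ in range(int(days / dt)):
        infect_s = beta * s * i                     # infections among susceptibles
        infect_v = (1.0 - efficacy) * beta * v * i  # breakthrough infections
        vaccinate = effective_rate * s
        recover = gamma * i
        s += dt * (-infect_s - vaccinate)
        v += dt * (vaccinate - infect_v)
        i += dt * (infect_s + infect_v - recover)
        r += dt * recover
        cumulative_infection += dt * (infect_s + infect_v)
        cumulative_vaccination += dt * vaccinate
    return {
        "s": s, "v": v, "i": i, "r": r,
        "cumulative_infection": cumulative_infection,
        "cumulative_vaccination": cumulative_vaccination,
    }

low = simulate(misinformation=0.0)
high = simulate(misinformation=0.9)
for name, result in (("low misinformation", low), ("high misinformation", high)):
    print(
        f"{name}: infected {result['cumulative_infection']:.2f}, "
        f"vaccinated {result['cumulative_vaccination']:.2f}"
    )
```

This well-mixed toy corresponds only to the homogeneous-population case; the network results in the abstract (small-world versus scale-free topologies) would require an explicit contact graph, which this sketch does not attempt.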
Title: Unraveling vaccination behavior under misinformation in homogeneous and heterogeneous populations via integrated dynamical and network models. Journal: Computers in Biology and Medicine, vol. 204, Article 111530.
Pub Date : 2026-02-10 DOI: 10.1016/j.compbiomed.2026.111549
Krish Chaudhary, Narendra N. Khanna, Pankaj K. Jain, Rajesh Singh, Laura E. Mantella, Amer M. Johri, Gavino Faa, Mohamed Abbas, John R. Laird, Mustafa Al-Maini, Esma R. Isenovic, Luca Saba, Jasjit S. Suri
Background and motivation
Classifying diseases such as heart conditions from gene expression data depends on selecting informative genes. Traditional machine learning (ML) often relies on simple feature selection (FS) techniques, which can limit accuracy. In our research, we combine deep learning (DL) with gene-focused methods such as differential expression analysis (DEA) to significantly improve classification performance.
Method
We evaluated ML and DL classifiers on two gene expression datasets (GSE36961 and GSE57345). We tested four hypotheses using feature selection methods such as chi-square and DEA, and applied principal component analysis (PCA) to reduce the number of features. To ensure the reliability of our findings, we applied k-fold cross-validation, hyperparameter tuning, and block effect analysis, and assessed data augmentation and generalization. Statistical tests, including the paired t-test, Mann–Whitney U test, and Wilcoxon signed-rank test, were performed to compare model performance.
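As a rough illustration of how a paired comparison across cross-validation folds might look, the sketch below computes an exact two-sided Wilcoxon signed-rank test in plain Python. The function name, the variable names, and the per-fold counts are all hypothetical and are not taken from the study; the abstract does not describe the authors' actual test setup.

```python
from itertools import product


def wilcoxon_signed_rank(a, b):
    """Exact two-sided Wilcoxon signed-rank test for paired scores.

    Returns (w_plus, w_minus, p_value). Zero differences are dropped
    and tied absolute differences receive average ranks.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    stat = min(w_plus, w_minus)

    # Exact null distribution: under H0 every sign pattern is equally
    # likely, so enumerate all 2**n patterns (fine for small n).
    total = sum(ranks)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= stat:
            extreme += 1
    return w_plus, w_minus, extreme / 2 ** n


# Hypothetical correctly classified samples per fold (5-fold CV), made up
# for illustration only.
dl_correct = [91, 88, 93, 90, 94]
ml_correct = [85, 86, 84, 89, 83]

w_plus, w_minus, p_value = wilcoxon_signed_rank(dl_correct, ml_correct)
print(w_plus, w_minus, p_value)
```

With only five folds, the smallest two-sided p-value this exact test can produce is 2/32 = 0.0625, even when one model wins every fold, so a comparison of this kind needs more folds or repeated cross-validation to reach conventional significance thresholds.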
Results
Our experiments on two gene expression datasets (GSE36961, GSE57345) not only confirmed all four hypotheses (H1, H2, H3, and H4) but also revealed significant performance improvements. For H1, without FS, DL outperformed ML models by a substantial margin. For H2, with FS, DL outperformed ML models by a significant percentage. In H3, ML with FS improved over ML without FS by a considerable margin. For H4, DL with FS outperformed DL without FS by a noticeable percentage. Among FS methods, DEA consistently yielded the best results for both ML and DL, further underlining the significance of our findings.
Conclusions
Combining DL with biological feature selection, especially DEA, improves gene expression classification and enables gene ranking and biomarker identification. This integrative approach balances modeling power with biological relevance, providing a reproducible framework for robust biomarker-based classification.
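The gene ranking idea can be sketched in a few lines, assuming log2-scale expression values and a two-group design. This is a simplified stand-in that ranks genes by a per-gene Welch t-statistic; it is not the authors' pipeline, and established DEA tools such as limma or DESeq2 additionally use moderated statistics and multiple-testing correction. The gene names and expression values below are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def rank_genes(expr, labels):
    """Rank genes by absolute Welch t-statistic between two groups.

    expr:   dict mapping gene name -> list of log2 expression values,
            one value per sample.
    labels: list of 0/1 group labels (0 = control, 1 = case), in the
            same sample order as the expression lists.
    Returns a list of (gene, log2_fold_change, t_stat), with the most
    differentially expressed gene first.
    """
    results = []
    for gene, values in expr.items():
        case = [v for v, g in zip(values, labels) if g == 1]
        ctrl = [v for v, g in zip(values, labels) if g == 0]
        # On log2-scale data, the difference of group means is the
        # log2 fold change.
        lfc = mean(case) - mean(ctrl)
        se = sqrt(variance(case) / len(case) + variance(ctrl) / len(ctrl))
        t_stat = lfc / se if se > 0 else 0.0
        results.append((gene, lfc, t_stat))
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


# Toy data: three control samples followed by three case samples.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "GENE_A": [5.0, 5.2, 4.8, 7.0, 7.2, 6.8],  # higher in cases
    "GENE_B": [6.0, 6.5, 5.5, 6.1, 6.4, 5.6],  # essentially unchanged
    "GENE_C": [8.0, 8.2, 7.8, 7.0, 7.2, 6.8],  # lower in cases
}

ranked = rank_genes(expr, labels)
for gene, lfc, t_stat in ranked:
    print(f"{gene}: log2FC={lfc:.2f}, t={t_stat:.2f}")
```

The top-ranked genes from a list like this could then serve as the selected features passed to a downstream classifier.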
Identification of high-risk genes and classification of acute myocardial infarction patients utilizing deep learning in a restricted cohort. Computers in Biology and Medicine, vol. 204, Article 111549. Published 2026-02-10.