Pub Date: 2025-03-12; DOI: 10.1016/j.compbiomed.2025.109925
Mohan Timilsina, Samuele Buosi, Muhammad Asif Razzaq, Rafiqul Haque, Conor Judge, Edward Curry
The rapid development of artificial intelligence (AI) has revolutionized healthcare, enabling significant improvements in various applications. This paper provides a comprehensive review of foundation models in healthcare, highlighting their transformative potential in areas such as diagnostics, personalized treatment, and operational efficiency. We discuss the key capabilities of these models, including their ability to process diverse data types such as medical images, clinical notes, and structured health records. Despite their promise, challenges remain, including data privacy concerns, bias in AI algorithms, and the need for extensive computational resources. Our analysis identifies emerging trends and future directions, emphasizing the importance of ethical AI deployment, improved interoperability across healthcare systems, and the development of more robust, domain-specific models. Future research should focus on enhancing model interpretability, ensuring equitable access, and fostering collaboration between AI developers and healthcare professionals to maximize the benefits of these technologies.
Title: Harmonizing foundation models in healthcare: A comprehensive survey of their roles, relationships, and impact in artificial intelligence’s advancing terrain. Journal: Computers in biology and medicine, vol. 189, Article 109925.
Pub Date: 2025-03-12; DOI: 10.1016/j.compbiomed.2025.109896
Yue Zhao, Lanying Zhu, Wendi Wang, Longwei Lv, Qiang Li, Yang Liu, Jiang Xi, Chun Yi
With the ongoing advancement of digital technology, oral medicine is transitioning from traditional diagnostics to computer-assisted diagnosis and treatment. Identifying dental implants in patients without records is complex and time-consuming. Accurate identification of dental implants is crucial for ensuring the sustainability and reliability of implant treatment, particularly in cases where patients lack available medical records. In this paper, we propose a multi-task fine-grained CBCT dental implant classification and segmentation method using deep learning, called MFPT-Net. This method, based on progressive training with multiscale feature extraction and enhancement, can differentiate minor implant features and similar features that are easily confused, such as implant threads. It addresses the problem of large intra-class differences and small inter-class differences among implants, achieving automatic, synchronized classification and segmentation of implant systems in CBCT images. Our dataset includes 437 CBCT sequences with 723 dental implants, acquired from three different centers. This dataset is the first instance of utilizing such a comprehensive collection of data for CBCT analysis. Our method achieved a classification accuracy of 92.98%, an average precision of 93.15%, an average recall of 93.31%, and an average F1 score of 93.18%, exceeding the second-best model by nearly 10%. Moreover, our segmentation Dice similarity coefficient reached 98.04%, which is significantly better than the current state-of-the-art method. External clinical validation with 252 implants demonstrated our model’s clinical feasibility. The results show that our proposed method could assist dentists with dental implant classification and segmentation in CBCT images, enhancing efficiency and accuracy in clinical practice.
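The Dice similarity coefficient reported above can be illustrated with a minimal sketch. It assumes binary masks given as nested lists; the function name `dice_coefficient` and the toy masks are hypothetical and are not taken from MFPT-Net or its dataset.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks of equal size."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)
    # By convention, two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x3 masks (illustrative values only).
pred_mask = [[0, 1, 1], [0, 1, 0]]
true_mask = [[0, 1, 0], [0, 1, 1]]
print(f"Dice: {dice_coefficient(pred_mask, true_mask):.4f}")
```

Dice weights the overlap twice against the summed mask sizes, so it is less dominated by the large background region than pixel accuracy, which is one reason it is commonly used for small structures.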
Title: Progressive multi-task learning for fine-grained dental implant classification and segmentation in CBCT image. Journal: Computers in biology and medicine, vol. 189, Article 109896.
Pub Date: 2025-03-12; DOI: 10.1016/j.compbiomed.2025.109986
Yinjun Zhang
Facial Micro-Expression Recognition (FER) presents challenges due to individual variations in emotional intensity and the complexity of feature extraction. While apex frames offer valuable emotional information, their precise role in FER remains unclear. Low-resolution facial images further degrade performance compared to high-resolution (HR) images. Existing methods, including super-resolution and convolutional neural networks, yield only moderate results. This work proposes a deep coupled AlexNet (DCAlexNet) model with a trunk network trained on multi-resolution images to extract discriminative features and a branch network for resolution-specific mapping between HR and low-resolution (LR) images. By integrating global and local facial information, DCAlexNet enhances micro-expression recognition while filtering irrelevant facial regions. The evaluations on FER2013, BU-3DFE, and Oulu-CASIA datasets demonstrate superior performance, achieving 98.3 % accuracy on FER2013, 97.2 % on BU-3DFE, and 96 % on Oulu-CASIA, with improved RMSE, RAE, and processing times.
Title: DCAlexNet: Deep coupled AlexNet for micro facial expression recognition based on double face images. Journal: Computers in biology and medicine, vol. 189, Article 109986.
Pub Date: 2025-03-12; DOI: 10.1016/j.compbiomed.2025.109987
Naif Alsharabi, Abdulaziz Alayba, Gharbi Alshammari, Mohammad Alsaffar, Amr Jadi
The Medical Internet of Things (MIoTs) encompasses compact, energy-efficient wireless sensor devices designed to monitor patients' body outcomes. Healthcare networks provide constant data monitoring, enabling patients to live independently. Despite advancements in MIoTs, critical issues persist that can affect the Quality of Service (QoS) in the network. The wearable IoT module collects data and stores it on cloud servers, making it vulnerable to privacy breaches and attacks by unauthorized users. To address these challenges, we propose an end-to-end secure remote healthcare framework called the Four Tier Remote Healthcare Monitoring Framework (FTRHMF). This framework comprises multiple entities, including Wireless Body Sensors (WBS), Distributed Gateway (DGW), Distributed Edge Server (DES), Blockchain Server (BS), and Cloud Server (CS). The framework operates in four tiers. In the first tier, WBS and DGW are authenticated to the BS using secret credentials, ensuring privacy and security for all entities. In the second tier, authenticated WBS transmit data to the DGW via a two-level Hybridized Metaheuristic Secure Federated Clustered Routing Protocol (HyMSFCRP), which leverages Mountaineering Team-Based Optimization (MTBO) and Sea Horse Optimization (SHO) algorithms. In the third tier, sensor reports are prioritized and analyzed using Multi-Agent Deep Reinforcement Learning (MA-DRL), with the results fed into the Hybrid-Transformer Deep Learning (HTDL) model. This model combines Lite Convolutional Neural Network and Swin Transformer networks to detect patient outcomes accurately. Finally, in the fourth tier, patients' outcomes are securely stored in a cloud-assisted redactable blockchain layer, allowing modifications without compromising the integrity of the original data. 
Compared with existing works, this research enhances network lifetime by 18.3 %, reduces transmission delays by 15.6 %, and improves classification accuracy by 7.4 %, with a PSNR of 46.12 dB, an SSIM of 0.8894, and an MAE of 22.51.
Title: An end-to-end four tier remote healthcare monitoring framework using edge-cloud computing and redactable blockchain. Journal: Computers in biology and medicine, vol. 189, Article 109987.
Pub Date: 2025-03-12; DOI: 10.1016/j.compbiomed.2025.109955
André Teixeira de Frades, Eduardo Luís de Aquino Neves, Oscar Felipe Falcão Raposo, Adriano Antunes de Souza Araújo
Background:
The aim of this study was to develop an objective computational solution, including a smartphone app, to evaluate colorimetric quantification of a membrane designed to diagnose small fiber peripheral neuropathies in individuals with diabetes. This membrane, SudoPad, was developed to improve diagnostic speed and accuracy while minimizing patient discomfort.
Methods:
SudoPad is a polymeric adhesive membrane made from sodium alginate, glycerol, Alizarin Red S, and sodium carbonate. A pilot study for a clinical trial was conducted with three groups to evaluate the membrane’s effectiveness. Statistical analysis focused on calculating positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and accuracy.
Results:
The SudoPad app demonstrated a PPV of 88.46%, an NPV of 85.90%, a sensitivity of 67.65% (95% CI: 58.65% to 76.64%), a specificity of 95.71% (95% CI: 91.81% to 99.60%), and an accuracy of 86.54%. These metrics indicate the system’s effectiveness in diagnosing peripheral neuropathies with a significant improvement in diagnostic accuracy.
Conclusions:
The results suggest that SudoPad, a low-cost, biodegradable membrane, could enhance health technologies for diagnosing neuropathies. It offers a faster, accurate, and more comfortable alternative to current diagnostic methods.
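The five diagnostic metrics named in the Methods can be computed from a 2x2 confusion matrix, as in the sketch below. The abstract does not report the underlying counts, so the values used here (`tp=23`, `fp=3`, `tn=67`, `fn=11`) are an assumption, chosen only because they happen to reproduce the reported percentages. The function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all diseased
        "specificity": tn / (tn + fp),  # true negatives among all healthy
        "ppv": tp / (tp + fp),          # diseased among all positive tests
        "npv": tn / (tn + fn),          # healthy among all negative tests
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Assumed counts (not stated in the source); see the note above.
metrics = diagnostic_metrics(tp=23, fp=3, tn=67, fn=11)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity describe the test itself, while PPV and NPV also depend on how common the condition is in the studied group, so they may not carry over to a population with a different prevalence.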
Title: Computer-aided diagnostic screening of diabetic peripheral neuropathy using colorimetric membrane analysis. Journal: Computers in biology and medicine, vol. 189, Article 109955.
Prostate adenocarcinoma (PAC) is a complex and common cancer in males and is one of the leading causes of cancer-related death globally. PAC is a multifaceted disease that encompasses different subtypes, including acinar and ductal adenocarcinoma, small cell carcinoma, neuroendocrine tumors, and transitional cell carcinoma, with each subtype presenting distinct prognostic difficulties. Predicting the overall survival (OS) of individuals with PAC therefore remains a substantial clinical challenge due to the diverse nature of the illness, coexisting medical conditions, and constraints associated with conventional diagnostic markers. As a result, we focus on using ensemble machine learning (ML) models to predict the OS of PAC patients.
We evaluated eight ML models: Random Forest (RF), AdaBoost, Gradient Boosting (GB), Extreme Gradient Boosting (XGB), LightGBM (LGBM), CatBoost, Hard Voting Classifier (HVC), and Support Vector Classifier (SVC), using a dataset obtained from The Cancer Genome Atlas (TCGA) PanCancer Atlas. The models were evaluated using essential performance indicators, such as accuracy, precision, recall, F-1 score, and ROC AUC score. The results show that GB outperformed the other models, obtaining a perfect score of 1.0 in accuracy, precision, recall, and F-1 score, and a ROC AUC of 0.99. Similarly, RF and AdaBoost exhibited robust efficiency, suggesting their potential in healthcare settings for predicting PAC survival. In conclusion, the study highlights the importance of ensemble techniques in improving prediction precision and underscores the need for further research in clinical settings.
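Of the models listed, the Hard Voting Classifier is the simplest to illustrate: each base model casts one vote per sample and the majority label wins. The sketch below is a generic illustration under that assumption, not the study's implementation. The function name `hard_vote` and the three toy prediction lists are hypothetical.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Majority vote over per-model label lists (one list per model)."""
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("every model must predict the same number of samples")
    voted = []
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns the most frequent label; on a tie, the label
        # that was encountered first among the votes is returned.
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Toy predictions from three hypothetical base models (1 = survived, 0 = not).
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
ensemble = hard_vote([model_a, model_b, model_c])
print(ensemble)
```

Using an odd number of base models avoids ties in the binary case. Soft voting, which averages predicted probabilities instead of counting labels, is a common alternative when the base models are well calibrated.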
Pub Date: 2025-03-12; DOI: 10.1016/j.compbiomed.2025.110008
Declan Ikechukwu Emegano, Mubarak Taiwo Mustapha, Dilber Uzun Ozsahin, Ilker Ozsahin
Title: Machine learning prediction of overall survival in prostate adenocarcinoma using ensemble techniques. Journal: Computers in biology and medicine, vol. 189, Article 110008.
Pub Date: 2025-03-11; DOI: 10.1016/j.compbiomed.2025.110001
Gyan Prakash Rai, Asheesh Shanker
Epidermal growth factor receptor (EGFR), the first receptor tyrosine kinase, plays a critical role in neoplastic metastasis, angiogenesis, tumor invasion, and apoptosis, making it a prime target for treating non-small cell lung cancer (NSCLC). Although tyrosine kinase inhibitors (TKIs) have shown high efficacy and promise for cancer patients, resistance to these drugs often develops within a year due to alterations. The present study investigates the compensatory alterations in EGFR to understand the evolutionary process behind drug resistance. Our findings reveal that coevolutionary alterations expand the drug-binding pocket, leading to reduced drug efficacy, and suggest that such changes significantly influence the structural adaptation of EGFR against these drugs. Analyses such as root mean square deviation (RMSD), root mean square fluctuation (RMSF), solvent accessible surface area (SASA), principal component analysis (PCA), and free energy landscape (FEL) demonstrated that structures of wild-type EGFR docked with gefitinib are more stable than those of the coevolution-dependent double mutant, which suggests greater susceptibility of the wild type to the drug. The findings were supported by MM-GBSA binding affinity analysis. The insights from this study highlight the evolution-induced structural changes that contribute to drug resistance in EGFR and may aid in designing more effective drugs.
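Two of the stability measures named above, RMSD and RMSF, can be sketched in a few lines. The sketch assumes that trajectory frames are already superimposed on a common reference, since no fitting step is performed here. The function names and toy coordinates are hypothetical and are not taken from the study's simulation data.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized sets of 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsf(frames):
    """Per-atom root mean square fluctuation around each atom's mean position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that mean position.
        msd = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in frames for k in range(3)
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


# Two toy frames of a two-atom system (illustrative coordinates only).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
displaced = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, displaced))
print(rmsf([reference, displaced]))
```

RMSD summarizes how far a whole structure drifts from a reference at one time point, whereas RMSF reports per-atom (or per-residue) mobility over the trajectory, so the two are usually read together when comparing wild-type and mutant complexes.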
Title: The coevolutionary landscape of drug resistance in epidermal growth factor receptor: A cancer perspective. Journal: Computers in biology and medicine, vol. 189, Article 110001.
Pub Date: 2025-03-11; DOI: 10.1016/j.compbiomed.2025.109956
Md Muhaiminul Islam Nafi
Among various post-translational modifications (PTMs), predicting C-linked and S-linked glycosites is an essential task, yet experimental techniques such as Capillary Electrophoresis (CE), Enzymatic Deglycosylation, and Mass Spectrometry (MS) are expensive. Therefore, computational techniques are required to predict these glycosites. Here, different language model embeddings and sequential features were explored. Two feature selection methods, Recursive Feature Elimination (RFE) and Particle Swarm Optimization (PSO), were employed to identify the optimal feature set. Cross-validation results were used to choose the final models. Three sampling strategies for handling imbalanced datasets were examined: Random undersampling, Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic Sampling Approach for Imbalanced Learning (ADASYN).
In this study, two models, DeepCSEmbed-C and DeepCSEmbed-S, are proposed for C-linked and S-linked glycosylation prediction, respectively. DeepCSEmbed-C is a dual-branch deep learning model comprising a Feedforward Neural Network (FNN) branch and an Inception branch, coupled with a Random undersampling strategy. DeepCSEmbed-S is a Categorical Boosting (CAT) model with the SMOTE oversampling strategy. DeepCSEmbed-C outperformed available state-of-the-art (SOTA) methods, achieving 92.9% sensitivity, 95.1% F1-score, and 90.6% MCC on the Independent dataset. Datasets and Python scripts for training and testing the models are freely accessible at https://github.com/nafcoder/DeepCSEmbed.
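Random undersampling, the strategy paired with DeepCSEmbed-C, balances classes by discarding majority-class samples until every class matches the size of the smallest one. The sketch below is a generic illustration and is not code from the DeepCSEmbed repository. The function name `random_undersample`, the `seed` parameter, and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Downsample every class to the size of the smallest class."""
    rng = random.Random(seed)  # local generator keeps the result reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n_min = min(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw n_min items without replacement from each class.
        for sample in rng.sample(items, n_min):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy imbalanced data: 2 positive sites and 8 negative sites.
toy_samples = list(range(10))
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
balanced_samples, balanced_labels = random_undersample(toy_samples, toy_labels)
print(Counter(balanced_labels))
```

Undersampling is cheap and introduces no synthetic points, but it throws away majority-class data. SMOTE and ADASYN take the opposite route and synthesize new minority-class samples, which is the trade-off the three strategies in the abstract represent.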
Title: Predicting C- and S-linked Glycosylation sites from protein sequences using protein language models. Journal: Computers in biology and medicine, vol. 189, Article 109956.
Breast cancer is a complex disease that mainly occurs in women and poses a global challenge because its genetic diversity complicates accurate diagnosis. The accepted approaches are categorized based on cancer subtype and metastasis level. This study focuses on a predictive drug discovery strategy for compounds that may modulate interaction with HER-2 and EGFR, two important receptors in cancer treatment. We employed a 3D QSAR methodology, complemented by molecular docking, ADMET analysis, and molecular dynamics simulations, to evaluate the antiproliferative effects of pyrazole-benzimidazole derivatives on MCF-7 cells as targeted therapies. External validation confirmed the predictive accuracy of the generated models. The best CoMSIA (Comparative Molecular Similarity Indices Analysis) and CoMFA (Comparative Molecular Field Analysis) models exhibited significant $Q^2$, $R^2$, and $R^2_{\mathrm{Test}}$ values, emphasizing the role of electrostatic and hydrophobic fields in inhibiting breast cancer cell growth. These findings provide a foundation for designing and predicting the biological effects of potent inhibitors. Additionally, ADMET analysis was conducted to evaluate the drug-likeness of the newly designed ligands, while molecular dynamics simulations confirmed the binding stability of the selected complexes. MMPBSA, PCA, and FEL investigations provide further support for this assertion, reinforcing the robustness of our conclusions.
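For reference, the fitted, cross-validated, and external-test coefficients of determination usually reported for CoMFA and CoMSIA models are conventionally defined as below. This is the common QSAR convention, given here as an assumption; the abstract does not state which variant the authors used. Here $\hat{y}_i$ is the fitted activity, $\hat{y}_i^{\,\mathrm{cv}}$ is the prediction for compound $i$ when it is left out during cross-validation, and $\bar{y}_{\mathrm{train}}$ is the mean training-set activity.

```latex
R^2 = 1 - \frac{\sum_{i \in \mathrm{train}} (y_i - \hat{y}_i)^2}
               {\sum_{i \in \mathrm{train}} (y_i - \bar{y}_{\mathrm{train}})^2},
\qquad
Q^2 = 1 - \frac{\sum_{i \in \mathrm{train}} (y_i - \hat{y}_i^{\,\mathrm{cv}})^2}
               {\sum_{i \in \mathrm{train}} (y_i - \bar{y}_{\mathrm{train}})^2},
\qquad
R^2_{\mathrm{Test}} = 1 - \frac{\sum_{j \in \mathrm{test}} (y_j - \hat{y}_j)^2}
                               {\sum_{j \in \mathrm{test}} (y_j - \bar{y}_{\mathrm{train}})^2}
```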
{"title":"Pyrazole-benzimidazole derivatives targeting MCF-7 breast cancer cells as potential anti-proliferative agents. 3D QSAR and In-silico investigations via molecular docking and molecular dynamics simulations","authors":"Etibaria Belghalia , Farid Elbamtari , Motasim Jawi , Abdelkrim Guendouzi , Abdelouahid Sbai , M'barek Choukrad , Tahar Lakhlifi , Mohammed Bouachrine","doi":"10.1016/j.compbiomed.2025.109969","DOIUrl":"10.1016/j.compbiomed.2025.109969","url":null,"abstract":"<div><div>Breast cancer is a complicated type of cancer that mainly occurs in women and poses a global challenge due to its genetic diversity, making accurate diagnosis challenging. The accepted approaches are categorized based on cancer subtype and metastasis level. This study focuses on a predictive drug discovery strategy for compounds that may modulate interaction with HER-2 and EGFR, two important receptors in cancer treatment. We employed a 3D QSAR methodology, complemented by molecular docking, ADMET analysis, and molecular dynamics simulations, to evaluate the antiproliferative effects of pyrazole-benzimidazole derivatives on MCF-7 cells as targeted therapies. External validation confirmed the predictive accuracy of the generated models. The best CoMSIA (Comparative Molecular Similarity Indices Analysis) and CoMFA (Comparative Molecular Field Analysis) models exhibited significant <span><math><mrow><msup><mrow><mspace></mspace><mi>Q</mi></mrow><mn>2</mn></msup></mrow></math></span>, <span><math><mrow><msup><mrow><mspace></mspace><mi>R</mi></mrow><mn>2</mn></msup></mrow></math></span>, and <span><math><mrow><msubsup><mi>R</mi><mrow><mi>T</mi><mi>e</mi><mi>s</mi><mi>t</mi></mrow><mn>2</mn></msubsup></mrow></math></span> values, emphasizing the role of electrostatic and hydrophobic fields in inhibiting breast cancer cell growth. These findings provided a foundation for designing and predicting the biological effects of potent inhibitors. 
Additionally, ADMET analysis was conducted to evaluate the drug-likeness of the newly designed ligands, while the stability of the complexes was confirmed by molecular dynamics simulations, which validate the binding stability of the selected chemicals. MMPBSA, PCA, and FEL investigations provide further support for this assertion, reinforcing the robustness of our conclusions.</div></div>","PeriodicalId":10578,"journal":{"name":"Computers in biology and medicine","volume":"189 ","pages":"Article 109969"},"PeriodicalIF":7.0,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143593671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-03-10 DOI: 10.1016/j.compbiomed.2025.109992
Qing Su , Wenting Liu , Xiaoyan Liu , Pixiong Su , Boqia Xie
Background
Coronary microvascular disease (CMVD), marked by dysfunction of the small coronary vessels, poses significant diagnostic challenges due to the complexity and high cost of current procedures like the index of microcirculatory resistance (IMR). This study aimed to identify metabolomic biomarkers from coronary artery samples to facilitate CMVD diagnosis using advanced bioinformatics techniques—specifically, random forest algorithms and generalized linear models (GLMs)—to develop more cost-effective blood-based diagnostics.
Methods
In this prospective study, 68 patients scheduled for coronary angiography and IMR assessment were enrolled. Plasma samples obtained from their coronary arteries were analyzed using untargeted metabolomics with liquid chromatography-mass spectrometry. Advanced bioinformatics methods were applied: random forest algorithms were utilized for feature selection to identify significant metabolites, and GLMs were constructed for predictive modeling. The diagnostic performance of the models was evaluated through receiver operating characteristic (ROC) curve analysis.
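The ROC evaluation step can be illustrated with the rank-based definition of AUC: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below is a pure-Python illustration under that definition, not the study's analysis code; the function name `roc_auc` and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = disease present).
    scores: iterable of model outputs, higher meaning more likely positive.
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities from a fitted model; not study data.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

The quadratic pairwise loop is adequate for a cohort of this size (68 patients). For larger datasets, a sort-based implementation or an established library routine would be the more usual choice.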
Results
The random forest analysis identified the top 10 metabolites that significantly contributed to the classification of CMVD. The GLM built using these metabolites demonstrated excellent diagnostic accuracy, achieving area under the ROC curve (AUC) values of 0.984 in the initial (discovery) cohort and 0.938 in the subsequent (validation) cohort. The use of mathematical modeling enhanced the robustness and interpretability of the biomarker selection process.
Conclusions
Advanced bioinformatics techniques, including random forest algorithms and GLMs, effectively identified key metabolites associated with CMVD. While the collection of coronary artery blood samples is invasive due to the necessity of coronary angiography, this method offers a more practical and cost-effective alternative to IMR measurement, potentially improving the diagnostic approach for CMVD.
{"title":"Bioinformatics-focused identification of metabolomic Markers in coronary microvascular disease","authors":"Qing Su , Wenting Liu , Xiaoyan Liu , Pixiong Su , Boqia Xie","doi":"10.1016/j.compbiomed.2025.109992","DOIUrl":"10.1016/j.compbiomed.2025.109992","url":null,"abstract":"<div><h3>Background</h3><div>Coronary microvascular disease (CMVD), marked by dysfunction of the small coronary vessels, poses significant diagnostic challenges due to the complexity and high cost of current procedures like the index of microcirculatory resistance (IMR). This study aimed to identify metabolomic biomarkers from coronary artery samples to facilitate CMVD diagnosis using advanced bioinformatics techniques—specifically, random forest algorithms and generalized linear models (GLMs)—to develop more cost-effective blood-based diagnostics.</div></div><div><h3>Methods</h3><div>In this prospective study, 68 patients scheduled for coronary angiography and IMR assessment were enrolled. Plasma samples obtained from their coronary arteries were analyzed using untargeted metabolomics with liquid chromatography-mass spectrometry. Advanced bioinformatics methods were applied: random forest algorithms were utilized for feature selection to identify significant metabolites, and GLMs were constructed for predictive modeling. The diagnostic performance of the models was evaluated through receiver operating characteristic (ROC) curve analysis.</div></div><div><h3>Results</h3><div>The random forest analysis identified the top 10 metabolites that significantly contributed to the classification of CMVD. The GLM built using these metabolites demonstrated excellent diagnostic accuracy, achieving area under the ROC curve (AUC) values of 0.984 in the initial (discovery) cohort and 0.938 in the subsequent (validation) cohort. 
The use of mathematical modeling enhanced the robustness and interpretability of the biomarker selection process.</div></div><div><h3>Conclusions</h3><div>Advanced bioinformatics techniques, including random forest algorithms and GLMs, effectively identified key metabolites associated with CMVD. While the collection of coronary artery blood samples is invasive due to the necessity of coronary angiography, this method offers a more practical and cost-effective alternative to IMR measurement, potentially improving the diagnostic approach for CMVD.</div></div>","PeriodicalId":10578,"journal":{"name":"Computers in biology and medicine","volume":"189 ","pages":"Article 109992"},"PeriodicalIF":7.0,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143578772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}