Interpretable machine learning is gaining importance as machine learning systems increasingly affect society and human lives. Many interpretability methods are applied to CNNs after training to provide deeper insight into their outcomes, but only a few attempt to promote interpretability during training. The aim of this experimental study is to investigate the interpretability of CNNs. The research was applied to chest computed tomography (CT) scans, because understanding CNN predictions is particularly important in the automatic classification of medical images. We attempted to implement a CNN technique aimed at improving interpretability by relating filters in the last convolutional layer to specific output classes. Variations of this technique were explored and assessed using chest CT images classified by the presence of lungs and lesions. A search was conducted to optimize the hyperparameters specific to the evaluated strategies. A novel strategy employing transfer learning and regularization is proposed. Models obtained with this strategy and the optimized hyperparameters were statistically compared to standard models, demonstrating greater interpretability without a significant loss in predictive accuracy. The resulting CNN models have improved interpretability, which is crucial for the development of more explainable and reliable AI systems.
Title: Improving CNN interpretability and evaluation via alternating training and regularization in chest CT scans. Authors: Rodrigo Ramos-Díaz, Jesús García-Ramírez, Jimena Olveres, Boris Escalante-Ramírez. Journal: Intelligence-based medicine, volume 11, Article 100211. Pub Date: 2025-01-01. DOI: 10.1016/j.ibmed.2025.100211
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100223
Kiruthiga Devi Murugavel , Parthasarathy R , Sandeep Kumar Mathivanan , Saravanan Srinivasan , Basu Dev Shivahare , Mohd Asif Shah
Bipolar disorder is a mental health condition characterized by mood fluctuations that range from manic to depressive states. Interviews with patients and information gathered from their families are essential steps in the diagnostic process for bipolar disorder. Automated approaches for treating bipolar disorder are also being explored. In mental health prevention and care, machine learning (ML) techniques are increasingly used to detect and treat diseases. Using frequently analyzed human behaviour patterns, identified symptoms, and risk factors as dataset parameters, predictions can be made to improve traditional diagnostic methods. In this study, a multimodal fusion system was developed that takes auditory, linguistic, and visual patient recordings as input to a three-stage mania classification decision system. Deep Denoising Autoencoders (DDAEs) are introduced to learn common representations across five modalities: acoustic characteristics, eye gaze, facial landmarks, head posture, and Facial Action Units (FAUs). This is done in particular for the audio-visual modality. The distributed representations and the transient information in each recording session are then encoded into Fisher Vectors (FVs). Once the FVs and document embeddings are integrated, a Multi-Task Neural Network performs the classification task while mitigating the overfitting caused by the limited size of the bipolar disorder dataset. The study introduces DDAEs for cross-modal representation learning and uses Fisher Vectors with Multi-Task Neural Networks, enhancing diagnostic accuracy while highlighting the benefits of multimodal fusion for mental health diagnostics. The system achieved an unweighted average recall of 64.8 %, a highest AUC-ROC of 0.85, and a low inference time of 6.5 ms/sample, which underscores the effectiveness of integrating multiple modalities in improving system performance and advancing feature representation and model interpretability.
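Unweighted average recall (UAR), the headline metric in this abstract, is the plain mean of the per-class recalls, so a small class counts as much as a large one. The sketch below is only an illustration of that definition: the integer labels 0, 1 and 2 standing for three mania levels and the toy predictions are assumptions, not data from the study.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls, weighting every class equally."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # number of samples per true class
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical labels for a three-level task (not taken from the paper).
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 2]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the single error falls in the smallest class, so UAR comes out lower than plain accuracy. That sensitivity to minority classes is why UAR is often preferred on small, imbalanced clinical datasets.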
Title: A multimodal machine learning model for bipolar disorder mania classification: Insights from acoustic, linguistic, and visual cues. Journal: Intelligence-based medicine, volume 11, Article 100223.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100242
Albert C. Yang , Wei-Ming Ma , Dung-Hung Chiang , Yi-Ze Liao , Hsien-Yung Lai , Shu-Chuan Lin , Mei-Chin Liu , Kai-Ting Wen , Tzong-Huei Lin , Wen-Xiang Tsai , Jun-Ding Zhu , Ting-Yu Chen , Hung-Fu Lee , Pei-Hung Liao , Huey-Wen Yien , Chien-Ying Wang
We aimed to develop an early warning system to predict sepsis based solely on single time-point, non-invasive vital signs, and to evaluate its correlation with related biomarkers, namely C-reactive protein (CRP) and procalcitonin (PCT). We used retrospective data from Physionet and four medical centers in Taiwan, encompassing a total of 46,184 Intensive Care Unit (ICU) patients, to develop and validate a machine learning algorithm based on XGBoost for predicting sepsis. The model was specifically designed to use non-invasive vital signs captured at a single time point. The correlation between the sepsis AI prediction model and levels of CRP and PCT was evaluated. The developed model demonstrated balanced performance across various datasets, with an average recall of 0.908 and precision of 0.577. The model's performance was further validated on the independent dataset from Cheng-Hsin General Hospital (recall: 0.986, precision: 0.585). Temperature, systolic blood pressure, and respiration rate were the top contributing predictors in the model. A significant correlation was observed between the model's sepsis predictions and elevated CRP levels, while PCT showed a less consistent pattern. Our approach, combining AI algorithms with vital sign data and its clinical relevance to CRP level, offers more precise and timely sepsis detection, with the potential to improve care in emergency and critical care settings.
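The abstract reports recall and precision but no combined score. As a rough reading aid, the sketch below computes the F1 score (the harmonic mean of precision and recall) from the reported pairs. The function name is illustrative. Because the first pair is an average over datasets, the F1 derived from it is only an approximation and is not a figure reported by the authors.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract above.
avg_f1 = f1_from_precision_recall(precision=0.577, recall=0.908)
external_f1 = f1_from_precision_recall(precision=0.585, recall=0.986)

print(f"F1 from averaged precision/recall: {avg_f1:.3f}")
print(f"F1 on the Cheng-Hsin dataset:      {external_f1:.3f}")
```

Both values land near 0.7. This matches the profile described in the abstract: the model rarely misses sepsis (high recall) at the cost of a sizeable share of false alarms (moderate precision).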
Title: Early prediction of sepsis using an XGBoost model with single time-point non-invasive vital signs and its correlation with C-reactive protein and procalcitonin: A multi-center study. Journal: Intelligence-based medicine, volume 11, Article 100242.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100243
Fahima Hossain, Rajib Kumar Halder, Mohammed Nasir Uddin
Alzheimer's disease (AD) is a degenerative neurological condition that impairs cognitive functioning. Early detection is critical for slowing disease progression and limiting brain damage. Although machine learning and deep learning models help identify Alzheimer's disease, their accuracy and efficiency are widely questioned. This study provides an integrated system for classifying four AD phases from 6400 MRI scans using pre-trained neural networks and machine learning classifiers. Preprocessing steps include noise removal, image enhancement (AGCWD, Bilateral Filter), and segmentation. Intensity normalization and data augmentation methods are applied to improve model generalization. Two models are developed: the first employs pre-trained neural networks (VGG16, VGG19, DenseNet201, ResNet50, EfficientNetV7, InceptionV3, InceptionResNetV2, and MobileNet) for both feature extraction and classification. In contrast, the second integrates features from these networks with machine learning classifiers (XGBoost, Random Forest, SVM, KNN, Gradient Boosting, AdaBoost, Decision Tree, Linear Discriminant Analysis, Logistic Regression, and Multilayer Perceptron). The second model incorporates an adaptive error minimization system for enhanced accuracy. VGG16 achieved the highest accuracy (99.61 % training and 97.94 % testing), whereas VGG19+MLP with adaptive error minimization achieved 97.08 %, exhibiting superior AD classification ability.
Title: An integrated machine learning based adaptive error minimization framework for Alzheimer's stage identification. Journal: Intelligence-based medicine, volume 11, Article 100243.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100213
S. Rajaprakash , C. Bagath Basha , C. Sunitha Ram , I. Ameethbasha , V. Subapriya , R. Sofia
The study of autism spectrum disorder (ASD) faces several challenges, including variations in brain connectivity patterns, small sample sizes, and data heterogeneity in detection by magnetic resonance imaging (MRI). These issues make it challenging to identify consistent imaging modalities. To address them and to better understand ASD neurology, researchers have explored improved analysis techniques based on multimodal imaging and graph-based methods. Current techniques focus mainly on pairwise comparisons between individuals and often overlook features and individual characteristics. To overcome these limitations, the proposed method uses a multiscale enhanced graph with a convolutional network for ASD detection.
This work integrates non-imaging phenotypic data (from brain imaging data) with functional connectivity data (from functional magnetic resonance images). In this approach, the population graph represents all individuals as vertices. The phenotypic data are used to calculate the weights between vertices in the graph with a fuzzy inference system, in which fuzzy if-then rules determine the similarity between the phenotypic data. Each vertex is associated with feature vectors derived from the image data. The weights of the edges between vertices incorporate the phenotypic information. A random walk with a fuzzy MSE-GCN framework employs multiple parallel GCN layer embeddings. The outputs from these layers are joined in a fully connected layer to detect ASD efficiently. We assessed the performance of this framework on the ABIDE dataset and used recursive feature elimination and a multilayer perceptron for feature selection. The method achieved an accuracy of 87 %, better than current studies.
Title: Using convolutional network in graphical model detection of autism disorders with fuzzy inference systems. Journal: Intelligence-based medicine, volume 11, Article 100213.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100231
Md Shakhawat Hossain , Munim Ahmed , Md Sahilur Rahman , MM Mahbubul Syeed , Mohammad Faisal Uddin
Ovarian cancer (OC) ranks fifth among cancer-related fatalities in women. Epithelial ovarian cancer (EOC) is a subclass of OC, accounting for 95 % of all patients. Conventional treatment for EOC is debulking surgery with adjuvant chemotherapy; however, in 70 % of cases, this leads to progressive resistance and tumor recurrence. The United States Food and Drug Administration (FDA) recently approved Bevacizumab therapy for EOC patients. Bevacizumab improved survival and decreased recurrence in 30 % of cases, while the rest reported side effects, which include severe hypertension (27 %), thrombocytopenia (26 %), bleeding issues (39 %), heart problems (11 %), kidney problems (7 %), intestinal perforation and delayed wound healing. Moreover, it is costly; single-cycle Bevacizumab therapy costs approximately $3266. Selecting patients for this therapy is therefore critical, given the high cost, the probable adverse effects and the small proportion of patients who benefit. Several methods were proposed previously; however, they failed to attain adequate accuracy. We present an AI-driven method to predict the effect of the therapy from an H&E whole slide image (WSI) produced from a patient's biopsy. We trained multiple CNN and transformer models using 10× and 20× images to predict the effect. Finally, the Data Efficient Image Transformer (DeiT) model was selected considering its high accuracy, interoperability and time efficiency. The proposed method achieved 96.60 % test accuracy and 93 % accuracy in 5-fold cross-validation and can predict the effect in less than 30 s. This method outperformed the state-of-the-art test accuracy (85.10 %) by 11 % and cross-validation accuracy (88.2 %) by 5 %. High accuracy and low prediction time ensured the efficacy of the proposed method.
Title: Predicting the effect of Bevacizumab therapy in ovarian cancer from H&E whole slide images using transformer model. Journal: Intelligence-based medicine, volume 11, Article 100231.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100207
Salam Bani Hani , Muayyad Ahmad
Title: Using big data to predict young adult ischemic vs. non-ischemic heart disease risk factors: An artificial intelligence based model. Journal: Intelligence-based medicine, volume 11, Article 100207.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2024.100197
Matteo Magnini , Gianluca Aguzzi , Sara Montagna
Medical chatbots are becoming essential components of telemedicine applications as tools to assist patients in the self-management of their conditions. This trend is particularly driven by advancements in natural language processing techniques with pre-trained language models (LMs). However, the integration of LMs into clinical environments faces challenges related to reliability and privacy concerns.
This study seeks to address these issues with a privacy-by-design architecture that relies on fully local deployment of open-source LMs. Specifically, to mitigate any risk of information leakage, we focus on evaluating the performance of open-source small language models (SLMs) that can be deployed on personal devices, such as smartphones or laptops, without stringent hardware requirements.
We assess the effectiveness of this solution using hypertension management as a case study. Models are evaluated across various tasks, including intent recognition and empathetic conversation, using Gemini Pro 1.5 as a benchmark. The results indicate that, for certain tasks such as intent recognition, Gemini outperforms other models. However, by employing the “large language model (LLM) as a judge” approach for semantic evaluation of response correctness, we found several models that demonstrate a close alignment with the ground truth. In conclusion, this study highlights the potential of locally deployed SLMs as components of medical chatbots, while addressing critical concerns related to privacy and reliability.
Title: Open-source small language models for personal medical assistant chatbots. Journal: Intelligence-based medicine, volume 11, Article 100197.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100201
Miguel R. Sotelo , Paul Nona , Loren Wagner , Chris Rogers , Julian Booker , Efstathia Andrikopoulou
Background
Understanding of the multifactorial determinants of rapid progression in patients with aortic stenosis (AS) remains limited. We aimed to develop and validate a machine learning (ML) model for predicting rapid progression from moderate to severe AS within one year.
Methods
A total of 8746 patients with moderate AS were identified across seven healthcare organizations. Three ML models (Random Forest, XGBoost, and causal discovery-logistic regression) were trained using demographic and echocardiographic variables, and an ensemble model integrating all three was developed. A total of 3355 patients formed the training and internal validation cohort. External validation was performed on 171 patients from one institution.
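The abstract does not say how the three base models were combined. One common option is soft voting, where the predicted probabilities are averaged and the mean is thresholded. The sketch below illustrates that option only; the function name `soft_vote`, the 0.5 threshold, and the per-model probabilities are hypothetical and are not taken from the study.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Average positive-class probabilities and threshold the mean.

    Assumption: soft voting is used purely for illustration; the source
    does not state how its ensemble integrates the base models.
    """
    avg = mean(probabilities)
    return avg, avg >= threshold


# Hypothetical probabilities of rapid progression for a single patient.
model_probs = {
    "random_forest": 0.62,
    "xgboost": 0.55,
    "logistic_regression": 0.30,
}

avg, is_rapid = soft_vote(list(model_probs.values()))
print(f"mean probability = {avg:.2f}, predicted rapid progression = {is_rapid}")
```

With these made-up inputs two of the three models lean positive, yet the averaged probability falls just under the threshold, which shows how soft voting can differ from a majority vote.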
Results
The ensemble model was selected for its superior F1 score and precision in internal validation (0.382 and 0.301, respectively). Its performance on the external validation cohort was modest (F1 score = 0.626, precision = 0.532).
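Because the F1 score is the harmonic mean of precision and recall, the recall implied by each reported pair can be recovered by solving F1 = 2PR / (P + R) for R. The sketch below does that arithmetic; the helper name `recall_from_f1` is an assumption, and the derived recalls are only approximate because the reported inputs are rounded to three decimals.

```python
def recall_from_f1(f1, precision):
    """Solve F1 = 2*P*R / (P + R) for recall R, given F1 and precision P."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 and precision are inconsistent (need F1 < 2 * precision)")
    return f1 * precision / denom


# Values reported in the abstract (rounded to three decimals in the source).
internal_recall = recall_from_f1(f1=0.382, precision=0.301)
external_recall = recall_from_f1(f1=0.626, precision=0.532)

print(f"implied internal recall ~ {internal_recall:.2f}")
print(f"implied external recall ~ {external_recall:.2f}")
```

Under these inputs the implied recall is roughly 0.52 for internal validation and roughly 0.76 for external validation.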
Conclusion
An ensemble model using only demographic and echocardiographic variables showed modest performance in predicting one-year progression from moderate to severe AS. Further validation in larger populations, along with integration of comprehensive clinical data, is crucial for broader applicability.
Title: Development and validation of a moderate aortic stenosis disease progression model. Intelligence-based medicine, volume 11, Article 100201.
Pub Date : 2025-01-01 DOI: 10.1016/j.ibmed.2025.100257
Anwar Jimi , Nabila Zrira , Oumaima Guendoul , Ibtissam Benmiloud , Haris Ahmad Khan , Shah Nawaz
One of the most important tasks in computer-aided diagnostics is the automatic segmentation of skin lesions, which plays an essential role in the early diagnosis and treatment of skin cancer. In recent years, the Convolutional Neural Network (CNN) has largely replaced traditional methods for segmenting skin lesions. However, because of insufficient information and unclear delineation of lesion regions, skin lesion segmentation remains challenging. In this paper, we propose a novel deep medical image segmentation approach named “ESC-UNET”, which combines the advantages of CNNs and Transformers to leverage both local information and long-range dependencies. For local information, we use a CNN-based encoder-decoder framework: the CNN branch mines local information from medical images using the locality of convolution operations and the pre-trained EfficientNetB5 network. For long-range dependencies, we build a Transformer branch that emphasizes global context. In addition, we employ Atrous Spatial Pyramid Pooling (ASPP) to gather network-wide relevant information. The Convolution Block Attention Module (CBAM) is added to the model to promote effective features and suppress ineffective ones during segmentation. We evaluated our network on the ISIC 2016, ISIC 2017, and ISIC 2018 datasets. The results demonstrate the efficiency of the proposed model in segmenting skin lesions.
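The atrous (dilated) convolution behind ASPP can be illustrated in one dimension: the same kernel is applied with its taps spaced `rate` samples apart, so larger rates cover a wider context without adding weights. The sketch below is a simplified illustration, not the ESC-UNET implementation. The function `dilated_conv1d`, the toy signal, and the rates (1, 2, 3) are assumptions, and a real ASPP module uses padded 2D convolutions whose branch outputs are concatenated.

```python
def dilated_conv1d(signal, kernel, rate):
    """Valid-mode 1D cross-correlation with kernel taps spaced `rate` apart."""
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


# Toy ramp signal and a simple difference kernel (both hypothetical).
signal = [0, 1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]

# One branch per dilation rate, mimicking the parallel branches of ASPP.
branches = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}

for rate, output in branches.items():
    print(f"rate={rate}: {output}")
```

On the ramp signal each branch measures the difference across a progressively wider window, which is the multi-scale behaviour that ASPP exploits.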
Title: ESC-UNET: A hybrid CNN and Swin Transformers for skin lesion segmentation. Intelligence-based medicine, volume 12, Article 100257.