Aims
Several artificial intelligence (AI)-based predictive tools have been developed to predict treatment non-adherence among patients with type 2 diabetes (T2D). This study aimed to describe and evaluate the methodological quality of AI-based predictive tools for identifying T2D patients at high risk of treatment non-adherence.
Methods
A systematic search was conducted across multiple databases, including EMBASE, Cochrane Library, MEDLINE, and Google Scholar. The Prediction model Risk Of Bias ASsessment Tool (PROBAST) was used to assess the quality of the studies. Tool performance was assessed by area under the curve (AUC), precision, recall, C-index, accuracy, sensitivity, specificity, or F1 score.
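As a reminder of how the listed metrics relate to one another, the following is a minimal sketch, not taken from any of the reviewed studies. The function names and the toy numbers are assumptions chosen for illustration only.

```python
def classification_metrics(tp, fp, tn, fn):
    """Threshold-based metrics computed from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # recall is the same quantity as sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def auc(scores, labels):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts and scores, for illustration only.
    print(classification_metrics(tp=40, fp=10, tn=40, fn=10))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Unlike the other metrics, AUC does not depend on a decision threshold, which is one reason values reported across studies are not directly interchangeable.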
Results
Most studies measured predictive ability using AUC (75 %), while smaller shares reported precision (25 %), recall (12.5 %), C-index (12.5 %), accuracy (37.5 %), sensitivity (12.5 %), specificity (12.5 %), or F1 score (25 %). All tools had moderate to high predictive ability (AUC > 0.70). However, only one study conducted external validation. Demographic characteristics, HbA1c, glucose monitoring data, and treatment details were the factors typically used in developing the tools.
Conclusions
Existing AI-based tools hold significant promise for improving diabetes care. However, future studies should focus on refining the existing tools, validating them in other settings, and evaluating the cost-effectiveness of AI-supported interventions.
Malede Berihun Yismaw, Chernet Tafere, Bereket Bahiru Tefera, Desalegn Getnet Demsie, Kebede Feyisa, Zenaw Debasu Addisu, Tirsit Ketsela Zeleke, Ebrahim Abdela Siraj, Minichil Chanie Worku, Fasikaw Berihun. Artificial intelligence based predictive tools for identifying type 2 diabetes patients at high risk of treatment Non-adherence: A systematic review. International Journal of Medical Informatics, vol. 198, Article 105858. Pub Date: 2025-03-01. DOI: 10.1016/j.ijmedinf.2025.105858
Objectives
AI/ML advancements have been significant, yet their deployment in clinical practice faces logistical, regulatory, and trust-related challenges. To promote trust and informed use of ML predictions in real-world scenarios, reliable assessment of individual predictions is essential. We propose RelAI, a tool for pointwise reliability assessment of ML predictions that can support the identification of prediction errors during deployment.
Materials and Methods
RelAI utilizes Autoencoders (AEs) to detect distributional shifts (Density principle) and a proxy model to encode local performance (Local Fit principle). We validated RelAI on a synthetic dataset and a real-world scenario involving Multiple Sclerosis (MS) patient outcomes.
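The two principles can be illustrated with a small sketch. This is not RelAI's implementation: a k-nearest-neighbour distance is swapped in for the autoencoder-based density check, and the model's accuracy on neighbouring training points is swapped in for the proxy model. All names, thresholds, and data below are hypothetical.

```python
import math


def _dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(train_X, x, k):
    """Indices of the k training points closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: _dist(train_X[i], x))
    return order[:k]


def is_reliable(x, train_X, train_correct, k=3, max_dist=1.0, min_local_acc=0.6):
    """Hypothetical pointwise reliability check.

    Density: x must lie close to the training data (mean kNN distance is
    used here as a simple stand-in for an autoencoder reconstruction error).
    Local fit: the model must have been mostly correct on x's neighbours
    (train_correct[i] is 1 if the model classified training point i correctly).
    """
    idx = knn(train_X, x, k)
    mean_dist = sum(_dist(train_X[i], x) for i in idx) / k
    local_acc = sum(train_correct[i] for i in idx) / k
    return mean_dist <= max_dist and local_acc >= min_local_acc


if __name__ == "__main__":
    train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    train_correct = [1, 1, 1, 0, 0, 1]
    print(is_reliable((0.2, 0.2), train_X, train_correct))  # dense region, good local fit
    print(is_reliable((5.2, 5.2), train_X, train_correct))  # dense region, poor local fit
    print(is_reliable((20, 20), train_X, train_correct))    # far from the training data
```

A prediction is flagged as reliable only when both checks pass, so a point can be rejected either because it is unlike the training data or because the model performed poorly in its neighbourhood.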
Results
On a synthetic dataset, RelAI effectively identified unreliable predictions, outperforming alternative approaches. In the MS case study, reliable predictions exhibited higher accuracy and were associated with specific demographic features, such as sex, residence, and eye symptoms.
Discussion and Conclusion
RelAI can support ML deployment in clinical settings by providing pointwise reliability assessments, ensuring regulatory compliance, and fostering user trust. Its model-agnostic nature and its compatibility with Python-based ML pipelines enhance its potential for widespread adoption.
Lorenzo Peracchio, Giovanna Nicora, Enea Parimbelli, Tommaso Mario Buonocore, Eleonora Tavazzi, Roberto Bergamaschi, Arianna Dagliati, Riccardo Bellazzi. RelAI: an automated approach to judge pointwise ML prediction reliability. International Journal of Medical Informatics, vol. 197, Article 105857. Pub Date: 2025-02-25. DOI: 10.1016/j.ijmedinf.2025.105857
Pub Date: 2025-02-23
DOI: 10.1016/j.ijmedinf.2025.105844
Andrew Vickers , Alexander Hollingsworth , Anthony Bozzo , Avijit Chatterjee , Subrata Chatterjee
Net benefit is the most widely used metric for evaluating the clinical utility of medical prediction models. The approach applies decision analytic theory to weight true and false positives depending on the relative consequences of different decision outcomes. It is plausible that there are at least some machine learning scenarios where optimization of the objective function during model development will not optimize net benefit during model evaluation. We therefore hypothesize that optimizing net benefit during model development will in some cases ultimately lead to higher clinical utility than optimizing for mean square error or some other unweighted loss function. There is some preliminary evidence that this does indeed occur. We accordingly recommend further methodologic research to determine the use cases where net benefit should be the objective function during model development.
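For readers unfamiliar with the metric, net benefit at a threshold probability p_t is commonly computed as TP/N − (FP/N) × p_t/(1 − p_t), so that false positives are weighted by the odds of the threshold. The sketch below evaluates it on toy data; the function name and the numbers are assumptions for illustration, not material from the article.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    True positives count fully; false positives are weighted by the odds
    of the threshold probability, threshold / (1 - threshold).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight


if __name__ == "__main__":
    # Hypothetical outcomes and predicted risks.
    y_true = [1, 1, 0, 0, 0]
    y_prob = [0.9, 0.4, 0.6, 0.2, 0.1]
    for t in (0.2, 0.5):
        print(t, net_benefit(y_true, y_prob, t))
```

Because the weight on false positives depends on the chosen threshold, a model fitted with an unweighted loss is not guaranteed to maximize this quantity, which is the gap the hypothesis addresses.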
Hypothesis: Net benefit as an objective function during development of machine learning algorithms for medical applications. International Journal of Medical Informatics, vol. 197, Article 105844.
Pub Date: 2025-02-22
DOI: 10.1016/j.ijmedinf.2025.105845
Huixiu Hu , Yajie Zhao , Chao Sun , Quanying Wu , Ying Deng , Jie Liu
Background
The 30-day hospital readmission rate is a key indicator of healthcare quality and system efficiency. This study aimed to develop machine-learning (ML) models to predict unplanned 30-day readmissions in older patients with ischemic stroke (IS) using a prospective cohort design.
Methods
Patients were divided into two datasets: dataset I (January 2020–December 2021) for model development and dataset II (January 2022–December 2023) for validation. A diffusion model was applied to address data imbalance. Eleven machine-learning methods, including Random Forest (RF), Logistic Regression, CatBoost, eXtreme Gradient Boosting, Light Gradient Boosting Machine, K-Nearest Neighbors, Support Vector Machine, Multi-Layer Perceptron, and Gaussian Naive Bayes, and 2 ensemble learning models, were constructed to predict readmissions. Bayesian optimization was used to fine-tune the hyperparameters of these models. Model performance was primarily evaluated using the area under the receiver operating characteristic curve (AUC). Shapley Additive Explanations (SHAP) were utilized to identify and interpret the significance of predictive variables.
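The date-based division into development and validation sets can be sketched as follows. This is an illustrative assumption about how such a split might be coded, not the authors' pipeline; the record layout and the field name `admitted` are hypothetical.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Split records by admission date: earlier ones for model development,
    those on or after the cutoff for validation."""
    development = [r for r in records if r["admitted"] < cutoff]
    validation = [r for r in records if r["admitted"] >= cutoff]
    return development, validation


if __name__ == "__main__":
    # Hypothetical records; only the dates matter for the split.
    records = [
        {"id": 1, "admitted": date(2020, 3, 14)},
        {"id": 2, "admitted": date(2021, 12, 31)},
        {"id": 3, "admitted": date(2022, 1, 1)},
        {"id": 4, "admitted": date(2023, 7, 2)},
    ]
    dev, val = temporal_split(records, cutoff=date(2022, 1, 1))
    print([r["id"] for r in dev], [r["id"] for r in val])
```

Splitting by time rather than at random keeps later patients entirely out of model development, which gives a stricter check of how the model would perform on future admissions.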
Results
Dataset I included 489 patients, while dataset II comprised 418 patients, with readmission rates of 15.3 % and 16.0 %, respectively. The RF model achieved the highest predictive performance (AUC = 0.9116, sensitivity = 0.8806, specificity = 0.7806). SHAP analysis identified readiness for hospital discharge as the most significant predictor of readmission.
Conclusion
The RF model shows promise for predicting unplanned 30-day readmissions in older patients with IS. Multi-center studies with larger sample sizes are needed to validate these findings.
Enhancing readmission prediction model in older stroke patients by integrating insight from readiness for hospital discharge: Prospective cohort study. International Journal of Medical Informatics, vol. 197, Article 105845.
Background
The increasing use of Deep Learning (DL) in healthcare has highlighted the critical need for improved transparency and interpretability. While Explainable Artificial Intelligence (XAI) methods provide insights into model predictions, reliability cannot be guaranteed by simply relying on explanations.
Objectives
This position paper proposes the integration of Uncertainty Quantification (UQ) with XAI methods to improve model reliability and trustworthiness in healthcare applications.
Methods
We examine state-of-the-art XAI and UQ techniques, discuss implementation challenges, and suggest solutions to combine UQ with XAI methods. We propose a framework for estimating both aleatoric and epistemic uncertainty in the XAI context, providing illustrative examples of their potential application.
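To make the distinction between the two kinds of uncertainty concrete, the sketch below uses a common entropy-based decomposition for an ensemble of classifiers: total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the average entropy of the individual members, and epistemic uncertainty is the difference. It is not the framework proposed in the paper, and the function names and inputs are assumptions for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for one input.

    member_probs holds one class-probability list per ensemble member
    (or per stochastic forward pass).
    """
    m = len(member_probs)
    n_classes = len(member_probs[0])
    mean_p = [sum(p[c] for p in member_probs) / m for c in range(n_classes)]
    total = entropy(mean_p)                               # entropy of the averaged prediction
    aleatoric = sum(entropy(p) for p in member_probs) / m  # average per-member entropy
    epistemic = total - aleatoric                         # disagreement between members
    return total, aleatoric, epistemic


if __name__ == "__main__":
    # Members agree that the case is a coin flip: uncertainty is aleatoric.
    print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
    # Members are individually confident but disagree: uncertainty is epistemic.
    print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

Both toy inputs have the same total uncertainty, yet they call for different responses: the first reflects noise inherent in the data, the second a model that has not seen enough similar cases.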
Results
Our analysis indicates that integrating UQ with XAI could significantly enhance the reliability of DL models in practice. This approach has the potential to reduce interpretation biases and over-reliance, leading to more cautious and conscious use of AI in healthcare.
Massimo Salvi, Silvia Seoni, Andrea Campagner, Arkadiusz Gertych, U.Rajendra Acharya, Filippo Molinari, Federico Cabitza. Explainability and uncertainty: Two sides of the same coin for enhancing the interpretability of deep learning models in healthcare. International Journal of Medical Informatics, vol. 197, Article 105846. Pub Date: 2025-02-21. DOI: 10.1016/j.ijmedinf.2025.105846
Pub Date: 2025-02-19
DOI: 10.1016/j.ijmedinf.2025.105843
Hak Seung Lee , Ga In Han , Kyung-Hee Kim , Sora Kang , Jong-Hwan Jang , Yong-Yeon Jo , Jeong Min Son , Min Sung Lee , Joon-myoung Kwon , Byung-Hee Oh
Background
Despite the proliferation of heart failure (HF) mortality prediction models, their practical utility is limited. Addressing this, we utilized a significant dataset to develop and validate a deep learning artificial intelligence (AI) model for predicting one-year mortality in heart failure with reduced ejection fraction (HFrEF) patients. The study’s focus was to assess the effectiveness of an AI algorithm, trained on an extensive collection of ECG data, in predicting one-year mortality in HFrEF patients.
Methods
We selected HFrEF patients who had high-quality baseline ECGs from two hospital visits between September 2016 and May 2021. A total of 3,894 HFrEF patients (64% male, mean age 64.3, mean ejection fraction 29.8%) were included. Using this ECG data, we developed a deep learning model and evaluated its performance using the area under the receiver operating characteristic curve (AUROC).
Results
The model, validated against 16,228 independent ECGs from the original cohort, achieved an AUROC of 0.826 (95 % CI, 0.794–0.859). It displayed a high sensitivity of 99.0 %, a positive predictive value of 16.6 %, and a negative predictive value of 98.4 %. Importantly, the deep learning algorithm emerged as an independent predictor of one-year mortality in HFrEF patients, with an adjusted hazard ratio of 4.12 (95 % CI 2.32–7.33, p < 0.001).
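A high sensitivity alongside a low positive predictive value is what Bayes' rule produces when the outcome is uncommon and specificity is modest. The sketch below shows the relationship with hypothetical inputs; the specificity and prevalence used here are assumptions for illustration and are not values reported by the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from test characteristics
    and outcome prevalence, via Bayes' rule."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the effect of low prevalence.
    ppv, npv = predictive_values(sensitivity=0.9, specificity=0.5, prevalence=0.1)
    print(round(ppv, 3), round(npv, 3))
```

With these toy inputs most positive calls are false alarms even though few true events are missed, so such a model is better suited to ruling out risk than to confirming it.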
Conclusions
The depth and quality of our dataset and our AI-driven ECG analysis model significantly enhance the prediction of one-year mortality in HFrEF patients. This promises a more personalized, future-focused approach in HF patient management.
Electrocardiographic-Driven artificial intelligence Model: A new approach to predicting One-Year mortality in heart failure with reduced ejection fraction patients. International Journal of Medical Informatics, vol. 197, Article 105843.
Pub Date: 2025-02-19
DOI: 10.1016/j.ijmedinf.2025.105841
Aurélia Manns , Thomas Pezziardi , Natacha Kadlub , Anita Burgun , Alban Destrez , Rosy Tsopra
Background
The digital transition has changed the practice of exchanging patient medical information between health professionals. Challenges include the involvement of multiple professionals with varying communication styles, the exponential growth of diverse data types, interoperability issues due to non-integrated tools, and heightened security risks stemming from the use of unsecured applications and personal devices.
Here, we aimed to understand how to help surgeons better consider security during data exchange.
Methods
We conducted a qualitative study comprising 20 interviews with surgeons working in wards of several French institutions. The verbatim transcripts were analyzed manually by two researchers using an iterative thematic approach, resulting in a framework to improve practitioners’ security awareness.
Results
Our findings emphasize the necessity of a multifaceted strategy, as a single secure application is not sufficient. Effective solutions require combining tailored digital tools with educational initiatives and institutional support. The proposed application must meet specific requirements, and hospitals must at the same time provide clear regulations, financial investment, and continuous support to reduce professional constraints.
Conclusion
This study underscores the need for a holistic approach, spanning education, institutional backing, and advanced technology, to enhance data security in healthcare. Future studies could extend our framework by considering other healthcare settings and patient perspectives.
Enhancing security in patient medical information exchange: A qualitative study. International Journal of Medical Informatics, vol. 197, Article 105841.
Pub Date: 2025-02-18
DOI: 10.1016/j.ijmedinf.2025.105836
Hong Wu, Mingyu Li, Li Zhang
Introduction
During influenza season, some patients tend to seek medical advice through online platforms. However, due to time constraints, the informational and emotional support provided by physicians is limited. Large language models (LLMs) can rapidly provide medical knowledge and empathy, but their capacity for providing informational support to patients with influenza and assisting physicians in providing emotional support is unclear. Therefore, this study evaluated the quality of LLM-generated influenza advice and its emotional support performance in comparison with physician advice.
Methods
This study utilized 200 influenza question–answer pairs from the online health community. Data collection consisted of two parts: (1) A panel of board-certified physicians evaluated the quality of LLM advice vs physician advice. (2) Physician advice was polished using an LLM, and the LLM-rewritten advice was compared to the original physician advice using the LLM module.
Results
For informational support, there was no significant difference between LLM and physician advice in terms of the presence of incorrect information, omission of information, extent of harm, or empathy. Nevertheless, compared to physician advice, LLM advice was more likely to cause harm and to be in line with medical consensus. The LLM was also able to assist physicians in providing emotional support, since the LLM-rewritten advice was significantly more respectful, friendly, and empathetic than the original physician advice. The LLM-rewritten advice was also logically coherent. In most cases, the LLM neither added to nor omitted the original medical information.
Conclusion
This study suggests that LLMs can provide informational and emotional support for influenza patients. This may help to alleviate the pressure on physicians and promote physician-patient communication.
Comparing physician and large language model responses to influenza patient questions in the online health community. International Journal of Medical Informatics, vol. 197, Article 105836.
Background
Liver disease accounts for 4 % of global mortality. The advent of mobile technology has introduced a novel domain in liver disease management. Identifying effective mobile apps with pertinent information on liver diseases is essential. This study seeks to evaluate liver disease-related mobile applications using the Mobile Application Rating Scale (MARS) quality assessment tool.
Method
This research employs a cross-sectional descriptive and analytical methodology focusing on liver disease-related mobile applications. We evaluated all Persian and English mobile applications dedicated to liver diseases that were available on the Google Play, Cafe Bazaar, and Myket stores until 2023. After eliminating duplicates, evaluators extracted the technical specifications and features of the apps. The MARS was used to assess the quality of the mobile applications. Both statistical and machine learning methods were employed for analysis.
Results
A total of 2,044 mobile applications were identified, of which 49 were selected for final analysis. The liver-related topics covered by the apps included general liver disease (n = 20, 40.82 %), hepatitis (n = 9, 18.37 %), and fatty liver disease (n = 8, 16.33 %). In terms of functionality, the largest group of apps (n = 20, 40.82 %) served as calculators, 15 of them dedicated specifically to calculation. Among the calculator apps, three integrated educational elements, and two also supported diet and fitness alongside calculator functions. Additionally, 20 apps aimed to provide educational and informative content. The average quality score was 3.17 (SD = 0.20), with scores ranging from 2.33 to 4.45. The mean scores for Engagement, Functionality, Aesthetics, and Information were 4.20 (SD = 0.67), 4.00 (SD = 0.67), 4.00 (SD = 0.92), and 4.00 (SD = 0.67), respectively. The highest Subjective quality score was 4.75.
Conclusions
Liver disease-related mobile applications serve users in educational, diet and lifestyle, calculation, risk assessment, and management domains, focusing mainly on general liver diseases and hepatitis. However, the results revealed that the apps lack sufficient and reliable information.
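The category shares quoted in the Results can be re-derived from the reported counts. The following is a minimal sketch in Python, assuming the denominator is the 49 apps selected for final analysis; the names `TOTAL_APPS`, `focus_counts`, and `share` are illustrative and are not taken from the study.

```python
# Counts as reported in the Results; the denominator of 49 is an assumption
# based on the number of apps selected for final analysis.
TOTAL_APPS = 49
focus_counts = {
    "general liver disease": 20,
    "hepatitis": 9,
    "fatty liver disease": 8,
}


def share(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Percentage of analysed apps in each reported focus category.
shares = {focus: share(n, TOTAL_APPS) for focus, n in focus_counts.items()}

# Apps not covered by the three categories named in the abstract.
other = TOTAL_APPS - sum(focus_counts.values())

for focus, pct in shares.items():
    print(f"{focus}: {pct:.2f} %")
print(f"apps outside the three listed categories: {other}")
```

Under that assumption the computed shares match the reported 40.82 %, 18.37 %, and 16.33 %, and the three named categories leave 12 of the 49 apps unaccounted for, which is consistent with the abstract presenting the list as non-exhaustive.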
{"title":"Quality review and content analysis of liver complications mobile apps in Iran: A statistical and machine learning approach","authors":"Farzaneh Kermani , Mahdi Mahmoodi , Mahmood Reza Nasiri , Azam Orooji","doi":"10.1016/j.ijmedinf.2025.105842","DOIUrl":"10.1016/j.ijmedinf.2025.105842","url":null,"abstract":"<div><h3>Background</h3><div>Liver disease accounts for 4 % of global mortality. The advent of mobile technology has introduced a novel domain in liver disease management. Identifying effective mobile apps with pertinent information on liver diseases is essential. This study seeks to evaluate liver disease-related mobile applications using the Mobile Application Rating Scale (MARS) quality assessment tool.</div></div><div><h3>Method</h3><div>This research employs a cross-sectional descriptive and analytical methodology focusing on liver disease-related mobile applications. We evaluated all Persian and English mobile applications available on the <em>Google Play</em>, <em>Cafe Bazaar</em>, and <em>Myket</em> Stores dedicated to liver diseases until 2023. After eliminating duplicates, evaluators extracted technical specifications and features of apps. The MARS was employed to assess the quality of the mobile applications. Both statistical and machine learning methods were employed for analysis.</div></div><div><h3>Results</h3><div>A total of 2,044 mobile applications were identified, with 49 selected for final analysis. The apps focused on liver-related issues included general liver disease (n = 20, 40.82 %), hepatitis (n = 9, 18.37 %), and fatty liver disease (n = 8, 16.33 %). In terms of functionality, the majority of apps (n = 20, 40.82 %) served as calculators, with 15 specifically for calculation. Among these, three integrated educational elements, and two also supported diet and fitness alongside calculator functions. Additionally, 20 apps aimed to provide educational and informative content. 
The average quality score was 3.17 (SD = 0.20), with scores ranging from 2.33 to 4.45. Generally, the mean score of Engagement, Functionality, Aesthetics and Information were 4.20 (SD = 0.67), 4.00 (SD = 0.67), 4.00 (SD = 0.92), and 4.00 (SD = 0.67), respectively. The highest Subjective quality score was 4.75.</div></div><div><h3>Conclusions</h3><div>Liver disease-related mobile applications serve users in educational, diet and lifestyle, calculation, risk assessment, and management domains, focusing mainly on general liver diseases and hepatitis. However, the results revealed that the apps lack sufficient and reliable information.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"197 ","pages":"Article 105842"},"PeriodicalIF":3.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143437664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-16 DOI: 10.1016/j.ijmedinf.2025.105840
Ruba Sulaiman , Md.Ahasan Atick Faisal , Maram Hasan , Muhammad E.H. Chowdhury , Faycal Bensaali , Abdulrahman Alnabti , Huseyin C. Yalcin
Background
Transcatheter aortic valve implantation (TAVI) has demonstrated clear benefits, such as low invasiveness, in treating aortic stenosis. Despite these benefits, post-procedural complications may still occur. The severity of these complications depends on pre-existing clinical conditions and complex, patient-specific anatomical features. Accurate prediction of TAVI outcomes will assist in precise risk assessment for patients undergoing TAVI. Over the past decade, different machine learning (ML) approaches have been used to predict outcomes of TAVI. This systematic review aims to assess the application of ML in TAVI for the purpose of outcome prediction.
Methods
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline was adapted for searching the PubMed and Scopus databases on ML use in TAVI outcome prediction. Once the studies that met the inclusion criteria were identified, data from these studies were retrieved and examined further. Seventeen parameters relevant to TAVI outcomes were carefully identified for assessing the quality of the included studies.
Results
Following the search of these databases, 78 studies were initially retrieved, and 17 of them were included for further assessment. Most of the included studies focused on mortality prediction, using datasets of varying sizes and diverse ML algorithms. The most frequently used ML algorithms were random forest, logistic regression, and gradient boosting. Among the studied parameters, serum creatinine, age, BMI, hemoglobin, and aortic valve mean gradient were identified as key predictors of TAVI outcomes. These predictors were found to be well aligned with established associations in the current literature.
Conclusion
ML presents a promising opportunity for improving the success and safety of TAVI and enhancing patient-centered care. Although retrospective studies with low generalizability and heterogeneity currently form the basis of ML research on TAVI, future prospective investigations with highly heterogeneous TAVI patient cohorts will be critically important for firmly establishing the applicability of ML in predicting TAVI outcomes.
{"title":"Machine learning for predicting outcomes of transcatheter aortic valve implantation: A systematic review","authors":"Ruba Sulaiman , Md.Ahasan Atick Faisal , Maram Hasan , Muhammad E.H. Chowdhury , Faycal Bensaali , Abdulrahman Alnabti , Huseyin C. Yalcin","doi":"10.1016/j.ijmedinf.2025.105840","DOIUrl":"10.1016/j.ijmedinf.2025.105840","url":null,"abstract":"<div><h3>Background</h3><div>Transcatheter aortic valve implantation (TAVI) therapy has demonstrated its clear benefits such as low invasiveness, to treat aortic stenosis. Despite associated benefits, still post-procedural complications might occur. The severity of these complications depends on pre-existing clinical conditions and patient specific complex anatomical features. Accurate prediction of TAVI outcomes will assist in the precise risk assessment for patients undergoing TAVI. Throughout the past decade, different machine learning (ML) approaches have been utilized to predict outcomes of TAVI. This systematic review aims to assess the application of ML in TAVI for the purpose of outcome prediction<strong>.</strong></div></div><div><h3>Methods</h3><div>Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline was adapted for searching the PubMed and Scopus databases on ML use in TAVI outcomes prediction. Once the studies that meet the inclusion criteria were identified, data from these studies were retrieved and were further examined. 17 parameters relevant to TAVI outcomes were carefully identified for assessing the quality of the included studies.</div></div><div><h3>Results</h3><div>Following the search of the mentioned databases, 78 studies were initially retrieved, and 17 of these studies were included for further assessment. Most of the included studies focused on mortality prediction, utilizing datasets of varying sizes and diverse ML algorithms. The most employed ML algorithms were random forest, logistics regression, and gradient boosting. 
Among the studied parameters, serum creatinine, age, BMI, hemoglobin, and aortic valve mean gradient were identified as key predictors for TAVI outcomes. These predictors were found to be well aligned with established associations in current literature.</div></div><div><h3>Conclusion</h3><div>ML presents a promising opportunity for improving the success and safety of TAVI and enhancing patient-centered care. While currently retrospective studies with low generalizability and heterogeneity form the basis of ML TAVI research, future prospective investigations with highly heterogeneous patient TAVI cohorts will be critically important for firmly establishing the applicability of ML in predicting TAVI outcomes.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"197 ","pages":"Article 105840"},"PeriodicalIF":3.7,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143430293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}