
International Journal of Medical Informatics: Latest Articles

Development and validation of data-driven, decision tree–based algorithms for identifying Behçet’s disease in claims data
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-05 DOI: 10.1016/j.ijmedinf.2026.106266
Ken-ei Sada , Yoshia Miyawaki , Ryo Yanai , Takashi Kida , Akira Onishi , Ryusuke Yoshimi , Kunihiro Ichinose , Yasuhiro Shimojima

Objective

To develop and externally validate novel, data-driven algorithms that are based on appropriate variable selection methods for identifying patients with Behçet’s disease in Japan.

Methods

This retrospective cross-sectional study included 13,538 patients from six tertiary hospitals (November–December 2023). One year of claims data was linked to chart-confirmed Behçet’s disease diagnoses. Patients were randomly divided into training (n = 8,811) and test (n = 3,775) sets, with external validation (n = 952) from another hospital. Feature selection among Behçet’s disease-coded patients used the Least Absolute Shrinkage and Selection Operator, Boruta, and Recursive Feature Elimination. The diagnostic performance of the rule-based algorithms, which were derived from the decision tree models, was evaluated using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value, and F1 score.
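The six performance measures named above all follow from a 2×2 confusion matrix. The following is a minimal sketch of those definitions, not the authors' code; the `ConfusionCounts` type and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for a 2x2 confusion matrix."""
    tp: int  # algorithm positive, chart-confirmed disease
    fp: int  # algorithm positive, no disease on chart review
    fn: int  # algorithm negative, chart-confirmed disease
    tn: int  # algorithm negative, no disease on chart review


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Standard definitions; assumes every denominator is non-zero."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    return {
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


# Illustrative counts only; these are NOT taken from the study.
example = ConfusionCounts(tp=90, fp=10, fn=5, tn=895)
metrics = diagnostic_metrics(example)
```

With these made-up counts, PPV is 90 / (90 + 10) = 0.9 and accuracy is 985 / 1000 = 0.985.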

Results

Diagnosis codes alone achieved high sensitivity (1.000) and specificity (0.992) but modest PPV (0.767, test set; 0.850, external validation). Incorporating sulphamethoxazole–trimethoprim and colchicine prescriptions improved the positive predictive value, which was 0.793 in the test set and 0.865 in external validation.
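A near-perfect specificity can still give a modest PPV when the condition is rare in the screened population. The sketch below applies Bayes' rule to show this. The function names are hypothetical, and the inputs are the rounded figures from the abstract, so any prevalence derived this way is a rough back-of-envelope value rather than a number reported by the study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV from Bayes' rule for a binary classifier."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(sensitivity: float, specificity: float,
                       target_ppv: float) -> float:
    """Invert the PPV formula: the prevalence that would yield target_ppv."""
    fpr = 1.0 - specificity
    return target_ppv * fpr / (sensitivity * (1.0 - target_ppv) + target_ppv * fpr)


# Rounded test-set values for diagnosis codes alone, as quoted in the abstract.
prev = implied_prevalence(sensitivity=1.000, specificity=0.992, target_ppv=0.767)
check = ppv(1.000, 0.992, prev)  # recovers the target PPV
```

Under these rounded inputs the implied prevalence is on the order of a few percent, which is consistent with false positives accounting for a sizeable share of flagged patients even at a specificity of 0.992.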

Conclusion

Incorporating prescriptions alongside diagnosis codes improved PPV while maintaining high sensitivity and specificity. Building upon a data-driven framework that integrates variable selection methods and decision tree analysis, this study provides a validated and scalable approach for reliable claims-based research on Behçet’s disease.
International Journal of Medical Informatics, Volume 209, Article 106266.
Citations: 0
Evaluation of electronic health record to HL7® FHIR® mappings in pediatric research studies
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-04 DOI: 10.1016/j.ijmedinf.2026.106265
Maryam Y. Garza , Zhan Wang , Bhargav Adagarla , Michael W. Rutherford , Umit Topaloglu , Daniel K. Benjamin , Kanecia O. Zimmerman , Eric L. Eisenstein , Karan R. Kumar , on behalf of the Best Pharmaceuticals for Children Act – Pediatric Trials Network Steering Committee

Background

eSource technologies that exchange patient data from electronic health records (EHR) to clinical study electronic data capture (EDC) systems can reduce data quality errors and decrease data collection time. However, the availability of site-specific EHR data to support pediatric studies has not been evaluated.

Methods

We used a previously developed data element mapping procedure to evaluate the HL7® FHIR® standard’s coverage in multi-center pediatric clinical studies. Four study sites independently mapped three pediatric studies’ case report forms (CRFs) to their site’s EHR and FHIR® server data elements.

Results

Site investigators mapped 4152 total and 2070 distinct data elements. Only 33.8 % of total CRF data elements (n = 1402) and 27.4 % of distinct data elements (n = 568) could be mapped in FHIR® at the four sites. However, the percentage of total data elements mapped varied by pediatric study (55.3 %, 30.8 %, and 26.2 %) and by study site (46.4 %, 32.3 %, 27.8 %, and 26.6 %). The percentage of total CRF data elements mapped was higher in domains containing standard of care data (e.g., Concomitant Medications, Demographics, Diagnosis/Procedures, Medical History, and Vital Signs) and lower in domains containing protocol-specific data (e.g., Adverse Events, Concomitant Medications, Enrollment/Eligibility/Consent, and study treatment-related Labs and Vital Signs).
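The two headline coverage figures can be re-derived from the counts given in the abstract. This is a small arithmetic check, assuming the percentages are simple ratios rounded to one decimal place; the helper name `percent_mapped` is hypothetical. The per-study and per-site percentages cannot be checked the same way because their underlying counts are not reported here.

```python
def percent_mapped(mapped: int, total: int) -> float:
    """Share of data elements mapped, as a percentage rounded to one decimal."""
    return round(100.0 * mapped / total, 1)


# Counts quoted in the abstract.
total_coverage = percent_mapped(1402, 4152)    # total CRF data elements
distinct_coverage = percent_mapped(568, 2070)  # distinct data elements
```

Both values agree with the reported 33.8 % and 27.4 %.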

Conclusions

There is substantial between-study and between-site variability in the percentage of pediatric study data elements available in FHIR® at study sites. These results suggest that mapping solutions for pediatric studies utilizing eSource technologies will need to be site-specific.
International Journal of Medical Informatics, Volume 209, Article 106265.
Citations: 0
A machine learning-driven app for predicting the need for post-operative respiratory support in liver transplant recipients
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-03 DOI: 10.1016/j.ijmedinf.2026.106263
Ying Wang , Yanan Zhou , Yu Gong , Zhenbin Ding , Liuxiao Yang , Ting Wang

Background

Liver transplantation (LT) is a life-saving procedure for patients with end-stage liver disease, yet post-operative complications, particularly the need for respiratory support, remain a significant challenge. We aimed to develop and validate a machine learning (ML)-based predictive tool for postoperative respiratory support requirement in liver transplant recipients.

Methods

This single-center retrospective study was conducted at Zhongshan Hospital, Fudan University (Shanghai, China) from January 2018 to October 2023. Following data preprocessing, key variables were selected through univariate analysis, recursive feature elimination (RFE), chi-square test, and correlation analysis. Nine ML models were initially constructed and optimized via grid search with 5-fold cross-validation. The final model was selected based on area under the curve (AUC), accuracy, sensitivity, specificity, and F1-score, followed by comparative analysis with conventional scoring systems. Model interpretability was achieved using Shapley additive explanations (SHAP) analysis, providing both global and local explanations. For clinical implementation, we developed an online application platform for real-time prediction.

Results

The study included 1121 liver transplant recipients, divided into a discovery cohort (n = 749) and validation cohort (n = 372). Significant differences (P < 0.05) were observed between patients requiring versus not requiring respiratory support across multiple preoperative, intraoperative, and postoperative parameters. After hyperparameter optimization, the random forest (RF), stochastic gradient boosting (SGB), and logistic regression (LR) models were applied to the validation cohort. RF was ultimately selected as the final predictive tool, achieving an AUC of 0.790 (95 % CI: 0.723–0.857) in the test set and 0.713 (95 % CI: 0.658–0.767) in the validation cohort, significantly outperforming both the Model for End-Stage Liver Disease (MELD) and Acute Physiology and Chronic Health Evaluation II (APACHE II) scores. SHAP analysis revealed complex bidirectional relationships between predictors and outcomes, with certain variables showing both protective and risk-enhancing effects depending on clinical context.
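The AUC values above can be read as the probability that a randomly chosen patient who needed respiratory support receives a higher predicted risk than a randomly chosen patient who did not. The sketch below computes AUC directly from that pairwise definition. It is an illustration under that assumption, not the study's pipeline, and the labels and scores are invented.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), which is
    fine for a small illustration but slow for large cohorts.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Invented example: 1 = needed respiratory support, 0 = did not.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
toy_auc = auc(toy_labels, toy_scores)
```

In the toy data, 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.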

Conclusions

Based on large-scale clinical data, we developed a robust predictive model that can effectively assess the need for postoperative respiratory support in liver transplant recipients, thereby facilitating clinical decision-making and potentially improving patient outcomes. However, future multi-center validation is warranted to confirm generalizability.
International Journal of Medical Informatics, Volume 209, Article 106263.
Citations: 0
Advantages and challenges of tracking ST-segment elevation myocardial infarction patients with a real-time dashboard: A single-centre experience
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-02 DOI: 10.1016/j.ijmedinf.2026.106261
Egidio de Mattia , Filippo Paoletti , Daniela Pedicino , Giovanna Liuzzo , Carmen Angioletti , Alessia d’Aiello , Alessio Perilli , Andrea Adduci , Giovanni Arcuri , Emilio Meneschincheri , Barbara Ruffo , Melissa D’Agostino , Rita De Donno , Antonio Giulio de Belvis

Background

Timely primary percutaneous coronary intervention (pPCI) is the most important treatment to improve outcomes in ST-segment elevation myocardial infarction (STEMI), with a strong relationship between treatment delays and morbidity and mortality. The present study aims to define the main steps for setting up a real-time digital monitoring dashboard to improve the clinical performance of STEMI management and to evaluate the impact of its implementation on the proportion of patients receiving pPCI within 90 min.

Methods

The set-up of the digital monitoring system required the definition of detailed algorithms for the diagnosis, treatment, and rehab/follow-up phases. For each patient with a diagnosis of STEMI included in the clinical pathway (CP), a multidisciplinary working group identified i) rules for flagging patients along the CP, based on specific risk scores; and ii) the critical points of the CP to be monitored, such as door-to-balloon time, intensive care unit length of stay, and total hospital length of stay. An interrupted time series analysis and multivariable logistic regression models were used to assess changes in the outcome (pPCI within 90 min) after the platform implementation, adjusting for temporal and individual confounders.

Results

After the introduction of the dashboard, the proportion of timely pPCI improved from 40 % pre-implementation to 65 % post-implementation. Adjusted models indicated a twofold increase in the odds of meeting the 90-minute benchmark (OR = 2.00; 95 % CI: 0.99–4.12).
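To relate the raw proportions to the reported odds ratio, the sketch below converts the two percentages into an unadjusted odds ratio and checks whether the reported confidence interval contains the null value of 1. The helper names are hypothetical. The unadjusted figure is a back-of-envelope value derived from the rounded percentages and is not expected to match the adjusted OR of 2.00, which accounts for confounders.

```python
def odds(p: float) -> float:
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def odds_ratio(p_before: float, p_after: float) -> float:
    """Unadjusted odds ratio comparing two proportions."""
    return odds(p_after) / odds(p_before)


def interval_contains_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True when a confidence interval for a ratio includes the null value."""
    return lower <= null <= upper


# Rounded proportions of timely pPCI quoted in the abstract.
unadjusted_or = odds_ratio(p_before=0.40, p_after=0.65)

# Reported 95 % CI for the adjusted OR.
includes_one = interval_contains_null(0.99, 4.12)
```

The unadjusted ratio from the rounded percentages is about 2.8. The reported adjusted interval has a lower bound of 0.99, just below 1, which is worth keeping in mind when reading the twofold estimate.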

Conclusion

The real-time monitoring system showed a positive impact on the timely management of STEMI, highlighting the potential for improving healthcare efficiency and patient outcomes.
International Journal of Medical Informatics, Volume 209, Article 106261.
Citations: 0
A scoping review: how evaluation methods shape our understanding of ChatGPT’s effectiveness in healthcare
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-12-31 DOI: 10.1016/j.ijmedinf.2025.106248
Yuanyuan Liu , Yu Zhang, Haoran Mao

Background

The rapid growth in research on ChatGPT’s healthcare applications has led to diverse evaluation methods and substantially heterogeneous findings, undermining evidence reliability and hindering clinical translation.

Objectives

This review aims to examine how different evaluation methods shape our understanding of ChatGPT’s effectiveness in healthcare.

Methods

Studies published between 2023 and 2024 that assessed the use of ChatGPT in medical or healthcare-related contexts were included. Evidence was obtained from peer-reviewed literature analyzing ChatGPT’s applications across clinical, educational, and diagnostic domains. Following the PRISMA guidelines, the review analyzed 131 such studies.

Results

The results indicate that predominant evaluation approaches—controlled trial studies, expert assessment studies, measurement-based evaluation studies, and prompt generation analysis studies—systematically influence conclusions about ChatGPT’s performance due to their inherent methodological characteristics, such as subjectivity, objectivity, and differences in ecological validity. Further analysis reveals that ChatGPT’s performance is highly context-dependent, shaped by specific application scenarios, model versions, and prompting strategies.

Conclusions

To address methodological heterogeneity and the lack of standardization, this study recommends multi-method cross-validation strategies and a risk-stratified, standardized evaluation framework. These steps are essential to enhance the scientific rigor and reliability of ChatGPT’s assessment in healthcare and to provide a solid foundation for its clinical integration.
International Journal of Medical Informatics, Volume 209, Article 106248.
Citations: 0
Large language models versus healthcare professionals in providing medical information to patient questions: A systematic review
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-12-31 DOI: 10.1016/j.ijmedinf.2025.106250
Maud M.G. Jacobs , Jacobien H.F. Oosterhoff , Rintje Agricola , Walter van der Weegen

Objective

The rapid expansion of digital healthcare has increased the volume of patient communication, thereby adding to the workload of healthcare professionals. Large Language Models (LLMs) hold promise for offering automated responses to patient questions relayed through eHealth platforms, yet concerns persist regarding their effectiveness, accuracy, and limitations in healthcare settings. This study aims to evaluate the current evidence on the performance and perceived suitability of LLMs in healthcare, focusing on their role in supporting clinical decision-making and patient communication.

Materials and methods

A systematic search in PubMed and Embase up to June 11, 2025 identified 330 studies, of which 20 met the inclusion criteria for comparing the accuracy and adequacy of medical information provided by LLMs versus healthcare professionals and guidelines. The search strategy combined terms related to LLMs, healthcare professionals, and patient questions. The ROBINS-I tool assessed the risk of bias.

Results

A total of nineteen studies focused on medical specialties and one on the primary care setting. Twelve studies favored the responses generated by LLMs, six reported mixed results, and two favored the healthcare professionals’ response. Bias components generally scored moderate to low, indicating a low risk of bias.

Discussion and conclusions

The review summarizes current evidence on the accuracy and adequacy of medical information provided by LLMs in response to patient questions, compared to healthcare professionals and clinical guidelines. While LLMs show potential as supportive tools in healthcare, their integration should be approached cautiously due to inconsistent performance and possible risks. Further research is essential before widespread adoption.
Citations: 0
Leveraging large language models to automate the identification of healthcare access barriers for veterans
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-12-30 DOI: 10.1016/j.ijmedinf.2025.106247
Sudarshan Srinivasan , Caitlin Rizy , Maria Mahbub , David Bolme , Alina Peluso , Jodie Trafton , Ioana Danciu

Objective

To develop and evaluate an automated system for identifying healthcare barriers focusing on transportation issues in veterans’ clinical notes using large language models (LLMs) and to assess the impact of different prompting strategies on classification performance and explanation consistency.

Methods

We developed a hybrid system combining pattern matching for templated notes with LLM analysis for free-text notes. Using 2000 manually annotated clinical notes, we compared four prompting strategies (dual-role short, dual-role long, analysis-first, analysis-only) across Mistral-7B and Llama-3.1 models. We evaluated classification performance using standard metrics and assessed explanation consistency through embedding similarity analysis.
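
The hybrid routing described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the template phrases in `TEMPLATE_PATTERNS`, the function names, and the `keyword_stub` stand-in for an LLM call are all hypothetical, since the abstract does not specify them.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical template phrases; the real note templates are not given in the abstract.
TEMPLATE_PATTERNS = [
    (re.compile(r"transportation barrier:\s*yes", re.IGNORECASE), True),
    (re.compile(r"transportation barrier:\s*no", re.IGNORECASE), False),
]


def match_template(note: str) -> Optional[bool]:
    """Return a deterministic label if the note matches a known template, else None."""
    for pattern, label in TEMPLATE_PATTERNS:
        if pattern.search(note):
            return label
    return None


def classify_note(note: str, llm_classify: Callable[[str], bool]) -> Tuple[bool, str]:
    """Try pattern matching first; fall back to the LLM classifier for free text."""
    label = match_template(note)
    if label is not None:
        return label, "pattern"
    return llm_classify(note), "llm"


def keyword_stub(note: str) -> bool:
    """Offline stand-in for an LLM call so the sketch runs; not a real model."""
    return "no ride" in note.lower()
```

In a real pipeline the `llm_classify` argument would wrap a prompted model call; keeping it as a plain callable lets the deterministic path be tested without any model.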

Results

The analysis-first strategy achieved superior performance, with Mistral-7B reaching an F1 score of 0.914, outperforming traditional machine learning approaches (GBM: 0.786, BERT: 0.811). LLMs demonstrated higher explanation consistency within models (mean cosine similarity 0.887–0.908) compared to cross-model similarities (0.767–0.872). Pattern matching successfully handled 6.7% of templated notes deterministically. Mistral-7B showed greater internal consistency but higher abstention rates compared to Llama-3.1.
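
For readers unfamiliar with the two metrics reported here, the following sketch shows how an F1 score and a mean pairwise cosine similarity could be computed. It assumes binary labels and explanation embeddings supplied as plain lists of floats; the function names are illustrative, and the abstract does not say which embedding model or averaging scheme the authors used.

```python
import math
from itertools import combinations
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(vectors: Sequence[Sequence[float]]) -> float:
    """Average cosine similarity over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)
```

Under this reading, a within-model figure would come from embedding several explanations produced by one model for the same note, and a cross-model figure from pairing explanations across models.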

Conclusion

Requiring LLMs to analyze evidence before classification improves both accuracy and explanation consistency for identifying transportation barriers in clinical notes. This approach enables automated barrier detection at scale while providing clinically relevant explanations, supporting both population-level healthcare planning and individual patient care decisions.
Citations: 0
Survey on cancer patients’ attitudes towards AI and data protection: A cross-sectional study from an Italian cancer center
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-12-30 DOI: 10.1016/j.ijmedinf.2025.106237
Martina Cavallucci , Alice Andalò , Valentina Danesi , Nicola Gentili , Ilaria Massa , Emanuela Scarpi , Maria Chiara Restuccia , Roberto Vespignani , Alice Conficconi , Michela Palleschi , Ugo De Giorgi , Antonino Musolino , Filippo Merloni
Background
Artificial Intelligence (AI) is increasingly integrated into oncology, offering opportunities to improve diagnostics, treatment planning, and operational efficiency. However, patient perspectives on AI, especially regarding data protection and ethical implications, remain underexplored.
Objective
The objective of this study is to investigate cancer patients’ attitudes toward the use of Artificial Intelligence (AI) in healthcare, focusing on their awareness of data protection, perceived risks and benefits, and the conditions under which AI is considered acceptable. Additionally, the study aims to examine how demographic and educational factors influence patients’ views within the context of an Italian comprehensive cancer center.
Methods
A cross-sectional survey was conducted with 117 cancer patients who completed a 28-item online questionnaire. The survey evaluated levels of AI knowledge, perceptions of data privacy, concerns about AI in medical contexts, and willingness to share health data for research.
Results
Most participants demonstrated moderate awareness of AI (70.1%) and its medical applications (85.5%), with higher familiarity observed among younger and more educated individuals. While data protection understanding varied, 76.9% were willing to share personal health data for research aimed at improving cancer care. Concerns included reduced physician autonomy (52.1%) and diminished physician-patient interaction (63.3%). However, 82.9% of respondents found AI acceptable when clinical decisions remained under physician control. AI was most favorably viewed for administrative support and care process optimization.
Conclusion
Cancer patients generally view AI in healthcare positively, especially when it maintains physician oversight and safeguards data privacy. To ensure equitable and informed adoption, targeted educational initiatives and transparent communication strategies should address generational, educational, and digital literacy differences.
Citations: 0
Development and validation of interpretable machine learning models for dynamic prediction of prognosis in acute pancreatitis complicated by acute kidney injury: A multicenter study
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-12-30 DOI: 10.1016/j.ijmedinf.2025.106260
Xiaoyu Bai , Shuaijing Huang , Shu Huang , Bin Wang , Aijing Zhu , Suxia Qi , Yujia Gao , Hao Zhu , Tingwang Jiang , Bin Zhang , Yadong Feng

Objective

This study aims to develop and validate interpretable machine learning (ML) models to dynamically predict mortality risk among intensive care unit (ICU) patients diagnosed with acute pancreatitis complicated by acute kidney injury (AP-AKI).

Methods

The clinical data in the training set, including demographic characteristics, laboratory indicators, scoring systems, treatment modalities, and clinical management strategies, were obtained from three large-scale medical databases: the Medical Information Mart for Intensive Care, and the eICU Collaborative Research Database. The external validation set consisted of patients recruited from two independent hospitals. Predictive feature selection was conducted using univariate logistic regression, LASSO regularization, and multivariate logistic regression. Eleven machine learning (ML) algorithms—eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Adaptive Boosting (AdaBoost), Decision Tree, Gaussian Naive Bayes (GNB), Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), Bernoulli Naive Bayes (BernoulliNB), Linear Discriminant Analysis, LinearSVC, and Stochastic Gradient Descent (SGD)—were employed to develop predictive models. Finally, the SHapley Additive exPlanations (SHAP) method was applied to interpret the importance and directional effects of individual features.

Results

Dynamic in-hospital mortality prediction was performed at 24 h, 48 h, and 7 days after ICU admission, with nine to twelve variables selected depending on the time point. The XGBoost model outperformed the 10 other machine learning models, achieving training-set AUROCs of 0.961 (95% CI 0.95–0.97), 0.947 (95% CI 0.94–0.96), and 0.968 (95% CI 0.96–0.98) at these time points. The corresponding external validation AUROCs were 0.871 (95% CI 0.79–0.95), 0.799 (95% CI 0.66–0.94), and 0.667 (95% CI 0.47–0.87). For 90-day post-discharge mortality prediction, six variables were selected. The XGBoost model again performed best, with a training-set AUROC of 0.966 (95% CI 0.96–0.97) and an external validation AUROC of 0.745 (95% CI 0.61–0.88).
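
As a reading aid for the AUROC figures and their intervals, the sketch below computes AUROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one, and attaches a percentile bootstrap interval. The bootstrap is an assumption made for illustration: the abstract does not state how the 95% confidence intervals were derived, and the function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    y_true: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUROC (assumed method, not from the source)."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(y_true)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUROC is undefined; draw again
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
    return lo, hi
```

Small external cohorts give few positive-negative pairs, which is one plausible reason the external validation intervals reported above are much wider than the training-set ones.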

Conclusion

Web-based prognostic tools were developed to support clinical decision-making and optimize ICU bed resource management.
Citations: 0
Commentary on “Towards practical federated learning and evaluation for medical prediction models”
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-12-28 DOI: 10.1016/j.ijmedinf.2025.106249
Wen-Jiang Yang
Citations: 0