International Journal of Medical Informatics: Latest Articles

Large language models as second reviewers for medical errors in real-world internal medicine reports: a prospective comparative study of open- and closed-source models.
IF 4.1 | Zone 2 (Medicine) | Q2 Computer Science, Information Systems | Pub Date: 2026-02-07 | DOI: 10.1016/j.ijmedinf.2026.106316
Roko Skrabic, Ivan Viculin, Zvonimir Boban, Marko Kumric, Marino Vilovic, Josip Vrdoljak, Josko Bozic

Objective: Preventable errors in clinical documentation and decision-making remain a major threat to patient safety, yet the role of open-source large language models (LLMs) as practical "second reviewers" in general Internal Medicine remains unclear.

Methods: We prospectively assembled 102 real-world Emergency Internal Medicine reports (de-identified) and either inserted or confirmed realistic errors across four categories (diagnostics/investigations, medication/therapy, process/communication/follow-up, other). Three LLMs (open-source Deepseek-v3-r1 and GPT-OSS-120b, and closed-source OpenAI-o3) were prompted with a uniform system instruction to (i) localize the predefined error and (ii) recommend corrections. Two blinded Internal Medicine specialists independently graded outputs for error localization (0-1) and recommendation quality (Likert 1-4); disagreements were resolved analytically, and analyses used the more conservative rater. Three human clinicians independently reviewed subsets of the same cases to provide a comparator.

Results: Using the conservative rater, correct error localization was 72.5% (74/102; 95% CI 63.2-80.3) for Deepseek-v3-r1, 79.2% (80/101; 95% CI 70.3-86.0) for o3, and 65.7% (67/102; 95% CI 56.1-74.2) for GPT-OSS-120b (Cochran's Q p = 0.033). Pairwise McNemar tests favored o3 over GPT-OSS-120b (p = 0.020; Holm-adjusted p = 0.060); other contrasts were not significant. Recommendation quality was high for all models (median 4/4), with mean ± SD scores of 3.73 ± 0.49 for Deepseek-v3-r1, 3.65 ± 0.64 for o3, and 3.51 ± 0.73 for GPT-OSS-120b. Inter-rater agreement was excellent for GPT-OSS-120b (κ = 0.94 for detection; κ_w = 0.85 for quality), substantial for Deepseek-v3-r1 (κ = 0.75; κ_w = 0.47), and lower for o3 (κ = 0.31; κ_w = 0.14). All models frequently flagged additional clinically useful issues (≥99% of reports).

Conclusion: In real-world Internal Medicine reports with realistic, expert-defined errors, state-of-the-art open-source LLMs approached the performance of a leading closed model and clearly outperformed clinicians in error detection, while providing predominantly guideline-concordant corrective recommendations. Given their advantages for privacy, customizability, and potential local deployment, open models represent credible candidates for privacy-preserving "second-reviewer" support in Internal Medicine. Prospective, workflow-embedded trials that also quantify specificity on error-free notes, alert burden, and patient outcomes are now warranted.
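
The proportions and intervals in the Results above can be re-derived from the reported counts. The sketch below is an illustration only: the abstract does not name the interval method, so the Wilson score interval is an assumption here (it appears to agree with the reported bounds), and the function names are hypothetical. The Holm step shown covers only the smallest of the three pairwise p-values, since the other two are not reported.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def holm_smallest(p_min: float, m: int) -> float:
    """Holm adjustment for the smallest of m p-values: multiply by m, cap at 1."""
    return min(1.0, m * p_min)

# Counts as reported in the abstract (conservative rater).
reported = {"Deepseek-v3-r1": (74, 102), "o3": (80, 101), "GPT-OSS-120b": (67, 102)}
for model, (k, n) in reported.items():
    lo, hi = wilson_interval(k, n)
    print(f"{model}: {100 * k / n:.1f}% ({100 * lo:.1f}-{100 * hi:.1f})")

# Three pairwise McNemar contrasts; smallest raw p-value is 0.020.
print(f"Holm-adjusted p: {holm_smallest(0.020, 3):.3f}")
```

Note that the o3 denominator is 101 rather than 102, as in the abstract; the abstract does not say why one report is missing for that model.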

Volume 211, article 106316.
Citations: 0
Digital literacy training within interventions to support older adults with cardiovascular disease in using technologies: a systematic review.
IF 4.1 | Zone 2 (Medicine) | Q2 Computer Science, Information Systems | Pub Date: 2026-02-05 | DOI: 10.1016/j.ijmedinf.2026.106312
Kathy L Rush, Cherisse L Seaton, Rowan Ross, Taylor Robertson, Angeliki-Iliana Louloudi, Peter Loewen, Kristen R Haase, Jennifer Jakobi, Robert Janke

Background: Advances in digitalization in the health sector have created numerous opportunities for cardiovascular disease (CVD) self-management but also challenges, especially for older adults with lower digital health literacy. Reviews have examined impacts of digital health technology interventions on health outcomes without examining the role of training provided. The aim of this review is to synthesize evidence about the impacts of digital literacy training (DLT) and its characteristics, as a component of digital interventions related to cardiovascular health, on patient-reported outcome and experience measures among older adults with CVD.

Methods: In accordance with the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, a search of MEDLINE, EMBASE, CINAHL, and PsycINFO databases for articles published from inception to March 31, 2025 was conducted. Empirical studies reporting digital health technology training with adults (mean age 60+ years) with CVD were eligible for inclusion. Included articles were quality-rated using the Mixed Methods Appraisal Tool. Data were extracted on DLT and health technologies alongside patient-reported outcome (i.e., technology- and health-related) and experience measures.

Results: Of the 56 included studies (totaling 7698 participants), DLT varied considerably, with 51 describing in-person training. Two studies (totaling 519 participants) examined the role of training with positive impacts on technology- and health-related outcomes. In many of the remaining studies, positive technology-related outcomes were evident but could not be linked back to DLT separate from the overall intervention. In studies (n = 10) where training was evaluated, feedback from patients largely affirmed the training was needed.

Discussion: The collective evidence suggests DLT overall is useful and needed in digital interventions for older adults with CVD. More work is needed to elucidate the distinct role of DLT characteristics and to determine for whom and under what conditions DLT impacts health and technology-related outcomes.

Registration: The protocol for this review was registered on Aug 12, 2024, in the Open Science Framework (OSF) (see: https://osf.io/unhd9).

Volume 211, article 106312.
Citations: 0
The effect of artificial intelligence-assisted pulmonary rehabilitation on exercise capacity: A systematic review and meta-analysis.
IF 4.1 | Zone 2 (Medicine) | Q2 Computer Science, Information Systems | Pub Date: 2026-02-05 | DOI: 10.1016/j.ijmedinf.2026.106336
Ecran Cinkavuk, Ebru Calik, Naciye Vardar-Yagli

Introduction: Artificial intelligence (AI) technologies are increasingly being integrated into pulmonary rehabilitation (PR) to improve individualization, real-time monitoring, and adherence in individuals with chronic respiratory diseases. However, their clinical impact on exercise capacity remains unclear. This systematic review and meta-analysis aimed to evaluate the effectiveness of AI-supported PR programs compared to usual care in improving exercise capacity and respiratory function in adults with chronic respiratory diseases.

Methods: This systematic review and meta-analysis followed PRISMA guidelines and was registered with PROSPERO (ID: CRD420251075622). A comprehensive search was conducted across five electronic databases (PubMed, Web of Science, Scopus, Cochrane Central Register of Controlled Trials (CENTRAL) and PEDro) from inception to July 2025. Statistical analyses for the meta-analysis were conducted using RevMan 5.4.

Results: Three eligible RCTs with a total of 456 participants were included. Pooled analysis showed a significant improvement in 6-minute walk distance (6MWD) in the AI-assisted PR group compared with control (MD: 22.08 m; 95% CI: 4.96-39.20; p = 0.01). Moderate heterogeneity was observed (I² = 40%). No meta-analysis was conducted for respiratory function due to insufficient pre-post data. Risk of bias was generally low, though participant blinding was absent in all studies. Methodological quality was good, with a mean PEDro score of 6.0 ± 0.82.

Conclusion: AI-supported PR can significantly improve exercise capacity in individuals with chronic respiratory diseases. Despite promising results, high-quality studies in different pulmonary patient groups are needed to address existing limitations, particularly regarding standardization, cost-effectiveness, and clinical integration of AI technology.
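
The pooled estimate reported in the Results above can be checked for internal consistency. The sketch below back-calculates the standard error, z statistic, and two-sided p-value from the mean difference and its 95% CI, assuming a symmetric normal-approximation interval (which is what RevMan-style inverse-variance pooling typically yields, though the abstract does not state the model). It also inverts I² = (Q - df)/Q to show the Cochran's Q implied by three studies; the function names are hypothetical.

```python
import math

def summarize_mean_difference(md: float, ci_low: float, ci_high: float,
                              z_crit: float = 1.96) -> tuple[float, float, float]:
    """Back-calculate SE, z and two-sided p from a mean difference and its 95% CI."""
    se = (ci_high - ci_low) / (2 * z_crit)
    z = md / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a standard normal
    return se, z, p

def q_from_i_squared(i2: float, df: int) -> float:
    """Invert I^2 = (Q - df) / Q to get the Cochran's Q implied by a positive I^2."""
    return df / (1 - i2)

# Values as reported in the abstract: MD 22.08 m, 95% CI 4.96 to 39.20.
se, z, p = summarize_mean_difference(22.08, 4.96, 39.20)
print(f"SE = {se:.2f} m, z = {z:.2f}, p = {p:.3f}")

# Three RCTs give df = 3 - 1 = 2; I^2 reported as 40%.
print(f"Implied Q = {q_from_i_squared(0.40, df=2):.2f} on 2 df")
```

Under these assumptions the back-calculated p-value rounds to the reported 0.01, so the estimate, interval, and p-value are mutually consistent.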

Volume 211, article 106336.
Citations: 0
Clinical-radiological machine learning model for non-invasive diagnosis and stratification of peripheral artery disease: a multicenter study.
IF 4.1 | Zone 2 (Medicine) | Q2 Computer Science, Information Systems | Pub Date: 2026-02-03 | DOI: 10.1016/j.ijmedinf.2026.106338
Bowen Hou, Jinhan Qiao, Zheng Ran, Yitong Li, Zhongyichen Huang, Xiaolong Luo, Xiaoming Li

Background and objective: Peripheral artery disease (PAD) is an atherosclerotic disorder prevalent in the elderly that leads to peripheral function decline and body composition changes. Current diagnostic approaches lack sensitivity for early PAD detection and staging. This study aimed to develop and validate machine learning (ML) models of clinical and CT-based radiological features to improve PAD diagnosis and severity stratification.

Methods: A retrospective multicenter study was conducted using data from two institutions. Clinical and radiological features (including volumetric body composition and muscle texture parameters extracted from calf and thigh segments) were analyzed. Participants were randomly divided into training (70%) and test (30%) sets, stratified by PAD status. Models with different ML algorithms were developed and compared. Model interpretability was assessed with SHapley Additive exPlanations (SHAP) analysis, and performance was evaluated through receiver operating characteristic analysis, Hosmer-Lemeshow testing, Brier score, and calibration curves.

Results: This study comprised 342 participants: a training set (n = 176) and a test set (n = 76) from Institute 1, and an external validation set (n = 90) from Institute 2. Three models were developed: a clinical model (CM), a radiological model (RM), and a combined clinical-radiological model (CRM). The calf-based CRM using a random forest algorithm achieved areas under the curve of 0.871 (training), 0.870 (test), and 0.828 (validation), with good calibration (p ≥ 0.05) and a low Brier score. SHAP analysis visually interpreted feature contributions toward PAD diagnosis and staging.

Conclusions: The CRM model effectively integrated calf-derived radiological and clinical features into a noninvasive, interpretable tool for PAD diagnosis and severity stratification, demonstrating strong clinical applicability.
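
Two of the performance measures named in the Methods above, the area under the ROC curve and the Brier score, are simple to compute directly. The sketch below is a minimal pure-Python illustration on hypothetical labels and probabilities (not study data); it is not the authors' pipeline, and the function names are assumptions made for this example.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = PAD) and predicted probabilities; not study data.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
print(f"AUC = {roc_auc(y_true, y_prob):.3f}, Brier = {brier_score(y_true, y_prob):.4f}")
```

AUC measures ranking (discrimination) only, while the Brier score also penalizes miscalibrated probabilities, which is presumably why the study reports both alongside calibration curves.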

Volume 211, article 106338.
Citations: 0
Harmonizing patient-reported outcome measures for nasal complaints using traditional and machine learning methods.
IF 4.1 | Zone 2 (Medicine) | Q2 Computer Science, Information Systems | Pub Date: 2026-02-02 | DOI: 10.1016/j.ijmedinf.2026.106319
Miljan Jović, Esther Hof, Maryam Amir Haeri, Jasper Hoorweg, Stéphanie M van den Berg

Background: Nasal obstruction measurement instruments are widely used in the field of nasal surgery. Various scales measure nasal obstruction, and they differ in the number of items, their wording, and the type of response options. To pool data and analyze them together, the scores must be harmonized so that participants' nasal obstruction scores can be compared irrespective of the instrument they completed. Data harmonization has not yet been used in the field of nasal obstruction assessment.

Aim: This study sought the harmonization method that most precisely predicts scores on a target instrument from scores on another instrument, across four nasal complaint instruments. Specifically, a transformation of scores on the NOSE, Utrecht-Q, and SCHNOS was sought that makes them equivalent to ENFAS scores.

Methods: A total of 1324 unique patients completed all four measurement instruments. We tried linear equating, Item Response Theory (IRT), and the following machine learning methods: linear regression, random forest regression, support vector machine regression, and neural network. We used the root-mean-square error (RMSE) of differences between predicted and observed scores to evaluate the quality of harmonization in 5-fold cross-validation.

Results: The ML methods gave the best overall results (the lowest RMSEs) and outperformed IRT, a common choice for data harmonization in psychometrics.

Conclusion: The ML methods produced the best-quality results, confirming their strong potential for data harmonization. This study shows that, alongside linear equating and IRT, which are commonly used for data harmonization, ML methods can serve the same purpose and can even increase the quality of harmonization in certain use cases.
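
The simplest method compared in this study, linear equating, together with the RMSE-in-cross-validation evaluation described in the Methods, can be sketched in a few lines. This is an illustration under stated assumptions: mean-sigma linear equating on paired scores, deterministic folds by index, and made-up scores; the abstract does not describe the authors' actual implementation, and all names below are hypothetical.

```python
import math
import statistics

def fit_linear_equating(x, y):
    """Return a function mapping X scores to the Y scale by matching mean and SD."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = statistics.pstdev(y) / statistics.pstdev(x)
    return lambda score: my + slope * (score - mx)

def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed scores."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))

def cross_validated_rmse(x, y, k=5):
    """Mean RMSE over k folds; fold membership is index modulo k (deterministic)."""
    fold_errors = []
    for fold in range(k):
        train = [i for i in range(len(x)) if i % k != fold]
        test = [i for i in range(len(x)) if i % k == fold]
        equate = fit_linear_equating([x[i] for i in train], [y[i] for i in train])
        fold_errors.append(rmse([equate(x[i]) for i in test], [y[i] for i in test]))
    return statistics.fmean(fold_errors)

# Hypothetical paired scores (source instrument, target instrument); not study data.
source = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
target = [5, 12, 20, 24, 33, 41, 44, 52, 61, 66]
print(f"5-fold CV RMSE: {cross_validated_rmse(source, target):.2f}")
```

Because every method is scored on held-out patients with the same metric, a regression or neural-network model could be swapped in for `fit_linear_equating` and compared on equal terms, which is the comparison the study reports.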

Volume 211, article 106319.
Citations: 0
Explainable AI in Cardiology Diagnostics: A Systematic Review of Machine Learning, Meta-heuristic Optimization, and Clinical Text Mining for Coronary Artery Disease.
IF 4.1 | Zone 2 (Medicine) | Q2 Computer Science, Information Systems | Pub Date: 2026-02-02 | DOI: 10.1016/j.ijmedinf.2026.106321
Majdi Jaradat, Mohammed Awad

Background: This systematic review compiles evidence and examines how various artificial intelligence (AI) approaches, including machine learning (ML), natural language processing (NLP), meta-heuristic optimization, and explainable AI (XAI), are utilized to predict and diagnose coronary artery disease (CAD). We aim to identify the most commonly used models, evaluate their performance, and explore how interpretability and optimization enhance their usefulness in clinical practice.

Method: Five major databases (PubMed, Scopus, IEEE Xplore, ACM Digital Library, and SpringerLink) were searched for relevant studies published between January 2022 and August 2025, in accordance with the PRISMA guidelines. Two independent reviewers performed study selection and data extraction. The quality of the included studies was evaluated using a checklist based on QUADAS-2. Data on study characteristics, model types, validation methods, and performance metrics were collected and formed the basis of the analysis.

Results: Sixty-one studies met the inclusion criteria. ML and deep learning models demonstrated strong performance and achieved high accuracy in benchmark datasets, but showed limited clinical validation. Transformer-based models (e.g., BioBERT, ClinicalBERT) showed high efficacy for medical text analysis, but require substantial data and computational resources. Meta-heuristic algorithms (e.g., Genetic Algorithms, Particle Swarm Optimization) effectively improved model efficiency but were rarely applied to unstructured clinical narratives. XAI tools (e.g., SHAP, LIME) improved model transparency, though most studies highlight a need for more rigorous evaluation.

Conclusion: Integrating ML, NLP, meta-heuristic optimization, and XAI holds significant promise for advancing the diagnosis of CAD by improving both accuracy and interpretability. However, challenges such as data scarcity, limited external validation, and a lack of standardized, clinician-centric explainability impede clinical adoption. Future research should focus on hybrid frameworks validated on large, diverse, real-world datasets.

Citations: 0
Using causal rule mining to identify opportunities for value improvement in regional CABG care: A proof-of-concept study.
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-01 DOI: 10.1016/j.ijmedinf.2026.106317
Sophie van Heuveln, Gijs J van Steenbergen, Mileen R D van de Kar, Erwin S H Tan, Mohamed A Soliman-Hamad Veghel, Rik Eshuis, Lukas R C Dekker, Dennis van Veghel

Objective: To explore the potential of causal rule mining (CRM) as a complementary method to outcome monitoring in identifying plausible causal patterns that may explain undesired clinical outcomes or elevated care consumption in cardiac surgery.

Methods: In this proof-of-concept study, CRM was applied to data from 1,068 patients who underwent elective isolated coronary artery bypass grafting between January 2016 and March 2021 at a single heart center and its referral network in the Netherlands. Outcomes of interest included 1-year and 120-day mortality, in-hospital stroke, 30-day deep sternal wound infection (DSWI), 30-day re-explorations, 1-year coronary reinterventions, event-free survival, 30-day emergency department (ED) visits, postoperative length of stay, and preoperative fractional flow reserve (FFR) testing. Causal rules were considered relevant if both the odds ratio (OR) and its entire 95% confidence interval (CI) were > 1. Identified rules were independently reviewed by clinical experts.
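
The relevance criterion in the Methods can be illustrated with a small calculation. The sketch below is only an assumption about how such a check might look: the abstract does not say how the confidence interval was computed, so a standard Wald interval on the log odds ratio is used here, and the function names and the 2x2 cell counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: rule applies, outcome present     b: rule applies, outcome absent
    c: rule absent,  outcome present     d: rule absent,  outcome absent
    z=1.96 gives an approximate 95% interval (an assumption of this sketch).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def is_relevant(a, b, c, d):
    """A rule counts as relevant when the OR and the whole interval exceed 1."""
    odds_ratio, lower, _ = odds_ratio_ci(a, b, c, d)
    return odds_ratio > 1 and lower > 1

# Hypothetical counts, not taken from the study.
print(odds_ratio_ci(40, 60, 10, 90))
print(is_relevant(40, 60, 10, 90))
```

Because the interval is symmetric on the log scale, checking that the lower bound is above 1 is enough to confirm that the whole interval is above 1.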

Results: CRM identified 114 significant rules. Five rules were rated as 'new and interesting' and two additional rules were included based on special interest. In follow-up discussions, clinical experts agreed that three rules warrant further clinical investigation: (1) the absence of fractional flow reserve (FFR) testing reducing the likelihood of coronary reintervention, (2) absence of red blood cell (RBC) transfusion during admission reducing the likelihood of 30-day re-explorations, and (3) RBC transfusion increasing the likelihood of 30-day re-explorations.

Conclusion: CRM helped identify potential explanations for certain outcomes and care consumption, providing structured input for hypothesis-driven quality improvement and supporting efforts to enhance patient value.

Citations: 0
Development of an interpretable machine learning model for predicting 4-year chronic kidney disease risk in elderly hypertensive patients.
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-31 DOI: 10.1016/j.ijmedinf.2026.106320
Panji Wang, Yuan Meng, Zhaowei Sun, Jiaju Li, Hailong Tao

Introduction: Age and hypertension are key drivers of renal impairment, predisposing older hypertensive adults to faster kidney function decline and higher mortality. We aim to develop an interpretable machine learning model to predict 4-year chronic kidney disease (CKD) risk in this population.

Methods: Our study incorporated 4,142 hypertensive patients from the Health and Retirement Study (HRS) 2010 and 2012 cohorts for model development and internal validation, with additional temporal validation performed within the HRS 2006 and 2008 cohorts. External validation was conducted using three distinct subcohorts derived from the China Health and Retirement Longitudinal Study (CHARLS) database. Feature selection was implemented through an integrated LASSO-Boruta algorithm, followed by model construction using eight machine learning approaches. Discriminative performance was rigorously evaluated through multiple metrics, including receiver operating characteristic (ROC) curve analysis, accuracy, sensitivity, specificity, and Brier score. The optimal model underwent interpretability analysis via SHapley Additive exPlanations (SHAP) to elucidate decision-making mechanisms and was subsequently deployed as a web-based clinical prediction tool.
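
Two of the metrics named in the Methods, the area under the ROC curve and the Brier score, can be written out in a few lines. This is a minimal sketch in plain Python, not the authors' evaluation code; the function names and the toy labels and probabilities are assumptions made for illustration.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1).

    Lower is better; 0 means perfect probabilistic predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive case is scored above
    a random negative case, with ties counted as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy data, not from the study: 1 = developed CKD, 0 = did not.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, probs))      # discrimination
print(brier_score(labels, probs))  # calibration plus discrimination
```

The pairwise AUC loop is quadratic in the number of cases, which is fine for illustration; a rank-based formula or a library routine would be the usual choice for large cohorts.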

Results: Using a combined LASSO-Boruta strategy, we identified nine routinely available predictors for model development. In the training set, SVM achieved the highest AUC (0.735), closely followed by XGBoost (0.734); notably, in the temporal validation cohort, XGBoost was the only model with an AUC > 0.700 (0.702). Overall performance metrics derived from confusion matrices, together with Brier scores, suggested that XGBoost provided a favorable balance between sensitivity and specificity while maintaining acceptable probabilistic calibration. Calibration curves further suggested that XGBoost showed relatively stable agreement between predicted and observed risks across datasets, supporting its selection for subsequent SHAP-based interpretation and web deployment; SHAP identified age as the leading contributor to CKD risk.

Conclusions: We developed an interpretable model using routine clinical indicators to predict 4-year CKD risk in elderly hypertensive adults, with applicability across Asian and Caucasian populations.

Citations: 0
A personalized and complex mHealth intervention for the universal prevention of Perinatal mental Disorders in routine maternal Care: Design and development of e-Perinatal app.
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-31 DOI: 10.1016/j.ijmedinf.2026.106290
Company-Córdoba Rosalba, Caffieri Alessia, Barquero-Jimenez Carlos, Cruz-Cabrera Roberto, De-Juan-Iglesias Paula, Gil-Cosano José J, Goossens Lennert, Nieto-Casado Francisco J, Ureña-Lorenzo Amalia, Gómez-Gómez Irene, Motrico Emma

Background: Perinatal Mental Disorders (PMDs) are common during pregnancy and the first postpartum year, with negative consequences for women, their partners, and infants, as well as broader societal costs. While numerous interventions have been developed to prevent PMDs, there remains a need for a universal, personalized, and cost-effective solution integrated into routine maternal care. The e-Perinatal study aimed to address this gap. This paper describes the design of the e-Perinatal intervention, delivered via a dedicated mobile health app.

Methods: Guided by the Medical Research Council framework, the e-Perinatal app integrates Self-Determination Theory, Normalization Process Theory, and Patient and Public Involvement perspectives. Existing evidence was reviewed, and stakeholders participated in the co-development of digital micro-interventions (DMs). A clinical rule-based algorithm was implemented to generate personalized recommendations across four pathways: (1) weekly content delivery, (2) user preferences, (3) individual risk profile, and (4) PMD monitoring.

Results: The e-Perinatal app includes: 1) DMs focused on psychological, physical activity, and healthy lifestyle domains; 2) a personalized recommendation engine; 3) a social support section; 4) mental health monitoring; 5) an 'SOS' button for assistance; and 6) an appointment reminder tool. In total, 332 evidence-based DMs were developed for women and their partners and delivered in text, audio, and video formats. A clinical rule-based algorithm tailors recommendations according to user characteristics and perinatal stage, employing adaptive content filtering to optimize personalization.
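
To make the idea of a rule-based recommendation engine concrete, the sketch below filters and ranks a small catalog of micro-interventions. It is purely illustrative and is not the e-Perinatal algorithm: the class names, fields, scoring weights, and catalog entries are all hypothetical, and only three of the inputs mentioned in the abstract (perinatal stage, user preferences, risk profile) are modelled.

```python
from dataclasses import dataclass, field

@dataclass
class MicroIntervention:
    title: str
    domain: str             # e.g. "psychological", "physical_activity", "lifestyle"
    stage: str              # "pregnancy" or "postpartum"
    fmt: str                # "text", "audio", or "video"
    for_risk: bool = False  # hypothetical flag: aimed at users with elevated risk

@dataclass
class User:
    stage: str
    preferred_formats: set
    at_risk: bool = False
    seen: set = field(default_factory=set)  # titles already delivered

def recommend(user, catalog, limit=3):
    """Filter by perinatal stage and novelty, then rank by simple rules."""
    candidates = [
        dm for dm in catalog
        if dm.stage == user.stage and dm.title not in user.seen
    ]

    def score(dm):
        points = 0
        if dm.fmt in user.preferred_formats:
            points += 1  # rule: respect format preferences
        if user.at_risk and dm.for_risk:
            points += 2  # rule: prioritise risk-targeted content
        return points

    # sorted() is stable, so equally scored items keep their catalog order.
    return sorted(candidates, key=score, reverse=True)[:limit]

catalog = [
    MicroIntervention("Breathing exercise", "psychological", "pregnancy", "audio"),
    MicroIntervention("Mood check-in plan", "psychological", "pregnancy", "text", for_risk=True),
    MicroIntervention("Postnatal walk", "physical_activity", "postpartum", "video"),
    MicroIntervention("Sleep tips", "lifestyle", "pregnancy", "text"),
]
user = User(stage="pregnancy", preferred_formats={"audio"}, at_risk=True, seen={"Sleep tips"})
print([dm.title for dm in recommend(user, catalog)])
```

Keeping each rule as an explicit, separately weighted condition is one way to keep such an engine auditable by clinicians, which is a common motivation for rule-based rather than learned recommenders.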

Conclusion: The e-Perinatal app is a personalized mHealth intervention to prevent PMDs within routine maternal care. It combines evidence-based strategies, personalized recommendations, and adaptive digital content. Future research will assess the effectiveness, implementation, and real-world impact of the e-Perinatal intervention for PMD prevention.

Citations: 0
Navigating illness in a virtual world: the role of immersive technology across chronic care continuum - A scoping review.
IF 4.1 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-30 DOI: 10.1016/j.ijmedinf.2026.106311
Ramakrishna Dantu, Mohammad Murad, Kriti Sharma, Kirti Dutta, Laura Cravens-Ray

Immersive technologies offer promising capabilities for chronic disease management, but their implementation and specific applications across the chronic care continuum remain limited. This study examines how immersive technologies are being utilized across various chronic disease contexts through a scoping review. Using a comprehensive mapping of literature published between 1995 and 2024, we identified 2,012 relevant articles from major databases using WHO and CDC-defined chronic disease keywords and finally focused on 127 studies for detailed manual review. Our approach combined text analytics (BERTopic modelling) with manual synthesis. This methodology revealed eight key themes where immersive technologies are being applied: medical procedures, training and education for healthcare professionals, substance use disorder therapy, cognitive rehabilitation, physical rehabilitation, exergaming and biofeedback, navigation and spatial therapy, and pain, stress, and anxiety management. These themes reflect the growing use of immersive technologies to support diverse activities in chronic care settings. The findings highlight the breadth of immersive technology applications across multiple points in chronic care. Our study introduces a thematic framework for understanding immersive applications in healthcare and identifies research directions and opportunities for future investigation. Future research should explore long-term integration into clinical workflows, as well as inclusivity and adoption across diverse populations.

Citations: 0