Pub Date : 2026-01-14 DOI: 10.1016/j.ijmedinf.2026.106291
Zhihao Lei
{"title":"“Calibration or contamination?” Reassessing the evaluation of large language models for clinical mortality prediction","authors":"Zhihao Lei","doi":"10.1016/j.ijmedinf.2026.106291","DOIUrl":"10.1016/j.ijmedinf.2026.106291","url":null,"abstract":"","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106291"},"PeriodicalIF":4.1,"publicationDate":"2026-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 DOI: 10.1016/j.ijmedinf.2026.106277
Manuri De Silva , Alice Voskoboynik , Sailavan Ramesh , Janice Campbell , Saravanan Satkumaran , Daryl R. Cheng
Objective
Communicable diseases, especially seasonal respiratory illnesses, contribute significantly to paediatric hospital presentations and admissions. Existing surveillance systems often require retrospective manual data collation and focus on either demographic or clinical data, not both. The Communicable Diseases Platform (CDP) is a dynamic data platform that aggregates both data types for all communicable disease presentations to The Royal Children’s Hospital Melbourne (RCH).
Methods
In the pilot phase, the CDP extracted de-identified aggregated data from hospital electronic medical records for patients with positive respiratory swabs. A dashboard displayed positivity rate and cumulative hospital admissions trends from 2016 to 2025, further filterable by pathogen, age, presentation type and interventions.
Discussion
The CDP improves understanding of clinical profiles, disease burden and seasonal patterns, supporting better outbreak control, patient flow prediction and clinical surveillance. Future developments include immunisation data integration and machine learning algorithm evaluation for real-time vaccine effectiveness estimations and communicable disease predictive modelling.
{"title":"Communicable diseases platform (CDP): Real-Time clinical analytics for infections","authors":"Manuri De Silva , Alice Voskoboynik , Sailavan Ramesh , Janice Campbell , Saravanan Satkumaran , Daryl R. Cheng","doi":"10.1016/j.ijmedinf.2026.106277","DOIUrl":"10.1016/j.ijmedinf.2026.106277","url":null,"PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106277"},"PeriodicalIF":4.1,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-11 DOI: 10.1016/j.ijmedinf.2026.106275
Wenyong Wang , Mahnaz Samadbeik , Gaurav Puri , Donald S.A. McLeod , Elton Lobo , Tuan Duong , Titus Kirwa , Clair Sullivan
Background
Electronic Medical Records (EMRs) aim to improve efficiency, safety, and quality of care. However, the impact of EMR implementation, particularly in outpatient diabetes care, remains underexplored. This study explored clinicians’ perspectives on EMR use in diabetes outpatient care.
Methods
This qualitative study, conducted in line with COREQ guidelines, involved four focus groups with 22 clinicians (doctors, nurses, and allied health) at a metropolitan diabetes service in Queensland, Australia. Data were analysed using deductive content analysis, guided by the Quintuple Aim and Technology Acceptance Model/Unified Theory of Acceptance and Use of Technology frameworks.
Results
Clinicians reported mixed outcomes across the Quintuple Aim domains, shaped by technology adoption constructs. Facilitators such as improved efficiency, access to patient information, and prescribing safety reflected perceived usefulness and positive attitudes, contributing to favourable outcomes across multiple Quintuple Aim domains. Barriers such as navigation complexity, technical issues, alert fatigue, and overwhelming training led to negative outcomes in EMR use. Tensions around documentation practices and patient expectations of system use resulted in mixed outcomes. Overall, clinicians viewed EMRs as essential, but sustained adoption required improved usability, tailored training, and better system integration.
Conclusion
This study concludes that while the EMRs improved safety, efficiency, and access to information, their design and implementation also introduced burdens that negatively affected clinician experience. EMRs significantly shape the healthcare workforce, influencing workflow, wellbeing, and professional engagement. In outpatient diabetes care, specific workflow challenges such as glycaemic data integration highlight that existing EMR designs may not fully support the complexity of chronic disease management. To maximise benefits, EMR initiatives should be approached as quality improvement activities, with role-specific training, reliable infrastructure, and clinician involvement in system optimisation. Future research should address usability challenges, enhance integration, and ensure that both clinician and patient perspectives guide digital health transformation.
{"title":"Clinicians’ perspectives on electronic medical records use in diabetes outpatient Care: A qualitative study","authors":"Wenyong Wang , Mahnaz Samadbeik , Gaurav Puri , Donald S.A. McLeod , Elton Lobo , Tuan Duong , Titus Kirwa , Clair Sullivan","doi":"10.1016/j.ijmedinf.2026.106275","DOIUrl":"10.1016/j.ijmedinf.2026.106275","url":null,"PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106275"},"PeriodicalIF":4.1,"publicationDate":"2026-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145960768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-11 DOI: 10.1016/j.ijmedinf.2026.106271
Zhihong Han , Baixin Li , Jie Liu
Background
Aortic dissection (AD) is a critical cardiovascular disorder with substantial risks of short-term mortality. Some researchers have endeavored to utilize machine learning (ML) approaches to develop predictive models for the risk of mortality in AD. However, systematic evidence about the accuracy of these models remains scarce, which poses challenges to the development and enhancement of risk assessment tools. Therefore, this study seeks to systematically review the reliability of ML in forecasting the risk of mortality in AD.
Methods
A search was implemented through PubMed, Cochrane, Embase, and Web of Science up to September 11, 2025. The prediction model risk of bias (RoB) assessment tool (PROBAST) was leveraged to estimate the RoB of the included studies. Subgroup analyses were implemented based upon types of AD and time of death.
Results
In total, 35 studies were included, covering 19,838 patients with AD. Within the training datasets, ML models demonstrated a sensitivity (SEN) of 0.75 (95% CI: 0.72–0.78) and a specificity (SPE) of 0.77 (95% CI: 0.74–0.80) for predicting mortality in AD. Within the validation set, which mainly focused on type A aortic dissection (TAAD), the SEN was 0.79 (95% CI: 0.74–0.84) and the SPE was 0.78 (95% CI: 0.68–0.85). For in-hospital mortality, the SEN was 0.78 (95% CI: 0.72–0.83) and the SPE was 0.77 (95% CI: 0.65–0.86); for out-of-hospital mortality, the SEN and SPE ranged from 0.81 to 0.84 and from 0.74 to 0.86, respectively.
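For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed for a single model from its confusion-matrix counts. The function name and the counts are illustrative assumptions, not data from the review, and the pooled estimates in a meta-analysis are typically obtained with dedicated statistical models rather than by this direct calculation.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of deaths the model correctly flagged
    specificity = tn / (tn + fp)  # share of survivors the model correctly cleared
    return sensitivity, specificity


# Hypothetical counts chosen for illustration only (not taken from the review).
sen, spe = sensitivity_specificity(tp=75, fn=25, tn=154, fp=46)
print(f"SEN={sen:.2f} SPE={spe:.2f}")  # prints "SEN=0.75 SPE=0.77"
```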
Conclusion
ML models demonstrate remarkable accuracy in forecasting the risk of mortality in AD and, to some extent, show superior performance relative to existing scoring systems. Future research should incorporate more multi-center, multi-ethnic, and geographically varied cases to develop a more broadly applicable risk prediction tool and offer insights for tailored prevention strategies.
{"title":"Predictive value of machine learning for mortality risk in aortic dissection: a systematic review and meta-analysis","authors":"Zhihong Han , Baixin Li , Jie Liu","doi":"10.1016/j.ijmedinf.2026.106271","DOIUrl":"10.1016/j.ijmedinf.2026.106271","url":null,"PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106271"},"PeriodicalIF":4.1,"publicationDate":"2026-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-10 DOI: 10.1016/j.ijmedinf.2026.106276
Xizhi Wu , Madeline S. Kreider , Philip E. Empey , Chenyu Li , Yanshan Wang
Objective
Fluoropyrimidines are widely prescribed for colorectal and breast cancers, but are associated with toxicities such as hand-foot syndrome and cardiotoxicity. Since toxicity documentation is often embedded in clinical notes, we aimed to develop and evaluate natural language processing (NLP) methods to extract treatment and toxicity information.
Materials and methods
We constructed a gold-standard dataset of 236 clinical notes from 204,165 adult oncology patients. Domain experts annotated categories related to treatment regimens and toxicities. We developed rule-based, machine learning-based (Random Forest [RF], Support Vector Machine [SVM], Logistic Regression [LR]), deep learning-based (BERT, ClinicalBERT), and large language model (LLM)-based NLP approaches (zero-shot and error analysis prompting). Five-fold cross-validation was conducted to validate each model.
Results
Error analysis prompting achieved the highest precision, recall, and F1 scores for treatment (F1 = 1.000) and toxicity extraction (F1 = 0.965), whereas zero-shot prompting performed moderately (treatment F1 = 0.889, toxicity extraction F1 = 0.854). The rule-based approach reached F1 = 1.000 for treatment and F1 = 0.904 for toxicity extraction. LR and SVM ranked second and fourth for toxicity extraction (LR F1 = 0.914, SVM F1 = 0.903). Deep learning and RF underperformed: BERT reached F1 = 0.792 for treatment and F1 = 0.837 for toxicity extraction, ClinicalBERT reached F1 = 0.797 for treatment and F1 = 0.884 for toxicity extraction, and RF reached F1 = 0.745 for treatment and F1 = 0.853 for toxicity extraction.
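As a reminder of what the F1 values above summarise, here is a small sketch of precision, recall, and F1 computed from extraction counts. The helper name and the counts are hypothetical; the abstract does not report the underlying counts or how scores were averaged across categories and folds.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts; undefined ratios fall back to 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one toxicity category (illustration only).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, a method can only reach F1 = 1.000 when it makes neither false-positive nor false-negative extractions on the evaluated notes.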
Discussion
LLM-based error analysis prompting outperformed all other approaches, followed by machine learning methods. Machine learning and deep learning methods were limited by small training data and showed limited generalizability, particularly for rare categories.
Conclusion
LLM-based error analysis most effectively extracted fluoropyrimidine treatment and toxicity information from clinical notes, and has strong potential to support oncology research and pharmacovigilance.
{"title":"Automated extraction of fluoropyrimidine treatment and treatment-related toxicities from clinical notes using natural language processing","authors":"Xizhi Wu , Madeline S. Kreider , Philip E. Empey , Chenyu Li , Yanshan Wang","doi":"10.1016/j.ijmedinf.2026.106276","DOIUrl":"10.1016/j.ijmedinf.2026.106276","url":null,"PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106276"},"PeriodicalIF":4.1,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-10 DOI: 10.1016/j.ijmedinf.2025.106246
Ahmet Ugur Atilan, Niyazi Cetin
Objective
Large Language Models (LLMs) are increasingly applied to patient education, yet their performance in languages that are relatively underrepresented in medical-domain corpora and LLM training datasets remains underexplored. Psoriasis and psoriatic arthritis (PsA) are chronic, immune-mediated diseases requiring lifelong patient engagement, making them suitable conditions to evaluate the clarity, reliability, and inclusivity of AI-generated educational content. This study aimed to assess the comprehensibility, scientific reliability, and patient-centered communication of Turkish patient education materials for psoriasis vulgaris and PsA generated by seven state-of-the-art LLMs.
Methods
A cross-sectional analysis compared outputs from ChatGPT-4o, Gemini 2.0 Flash, Claude 3.7 Sonnet, Grok 3, Qwen 2.5, DeepSeek R1, and Mistral Large 2. Brochures were produced using standardized zero-shot prompts and evaluated via the Ateşman readability index and the DISCERN instrument. Overall differences in DISCERN scores across the seven models were assessed using a Friedman test, followed by Bonferroni-adjusted Wilcoxon signed-rank post-hoc analyses.
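The Ateşman index mentioned above is a Turkish adaptation of the Flesch reading-ease formula: 198.825 − 40.175 × (syllables per word) − 2.610 × (words per sentence), with higher scores indicating easier text. The sketch below is an illustrative implementation, not the authors' scoring procedure. It assumes that syllables can be counted as vowels (which holds for Turkish orthography) and uses deliberately simple sentence and word splitting; the function names and the sample sentence are hypothetical.

```python
import re

# Turkish vowels, including the circumflexed forms; one vowel = one syllable.
TURKISH_VOWELS = set("aeıioöuüâîû")


def count_syllables(word: str) -> int:
    """Count syllables in a Turkish word by counting its vowels."""
    return sum(1 for ch in word.lower() if ch in TURKISH_VOWELS)


def atesman_score(text: str) -> float:
    """Ateşman readability score using naive tokenisation (an assumption)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    words_per_sentence = len(words) / len(sentences)
    return 198.825 - 40.175 * syllables_per_word - 2.610 * words_per_sentence


# Hypothetical brochure fragment, for illustration only.
sample = "Sedef hastalığı bulaşıcı değildir. Tedavi ile kontrol altına alınabilir."
print(round(atesman_score(sample), 1))
```

Splitting on letters only means a suffixed proper noun written with an apostrophe is counted as two words, so scores from this sketch may differ slightly from those of a dedicated readability tool.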
Results
Readability scores ranged from 61.6 to 80.2 (mean = 71.3 ± 6.9), with ChatGPT-4o and Qwen 2.5 generating the most accessible texts. DISCERN reliability scores ranged from 38.5 to 60.5, with Claude 3.7 Sonnet and Gemini 2.0 Flash showing the highest accuracy. Models prioritizing factual precision produced denser language, while conversational models favored fluency but sacrificed depth. Notable variation was observed, with only Claude 3.7 Sonnet and Gemini 2.0 Flash consistently reflecting patient-centered perspectives.
Conclusion
LLMs showed observable differences in balancing clarity and reliability when generating health education leaflets in Turkish. Most outputs appeared to lack explicit psychosocial framing and emphasis on shared decision-making, which may suggest the need for more culturally adaptive training, clinician oversight, and locally grounded validation frameworks to support safe and inclusive AI-based patient education.
{"title":"An old disease, a new linguistic challenge for large language models: patient education on psoriasis and psoriatic arthritis in an underrepresented medical language","authors":"Ahmet Ugur Atilan, Niyazi Cetin","doi":"10.1016/j.ijmedinf.2025.106246","DOIUrl":"10.1016/j.ijmedinf.2025.106246","url":null,"PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106246"},"PeriodicalIF":4.1,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-09 DOI: 10.1016/j.ijmedinf.2026.106262
Moid Sandhu , Siddique Latif , Andrew Bayor , Wei Lu , Mahnoosh Kholghi , Deepa Prabhu , David Silvera-Tawil
Objective: This paper critically reviews existing work in sensor-based emotional dysregulation monitoring to support caregivers of individuals diagnosed with autism spectrum disorder (ASD).
Methods: A systematic literature search was conducted across six databases (Google Scholar, IEEE Xplore, Scopus, ACM Digital Library, Web of Science, and PubMed) covering publications from January 1, 2016, to September 30, 2025.
Results: Thirty-two studies met inclusion criteria, comprising 27 focused on sensor-based emotional dysregulation detection and 5 addressing intervention or support mechanisms. These studies suggest that sensor-based technologies have potential for continuous physiological monitoring, facilitating early detection and intervention to support emotional dysregulation episodes. Critical deficiencies were identified in real-time alerting capabilities, autonomous intervention deployment, self-regulation framework integration, system reliability, long-term sustainability, user interface design, and cross-environment scalability.
Conclusion: There is a significant need to develop real-time emotion monitoring systems to empower caregivers in delivering timely, targeted interventions for individuals diagnosed with ASD. Future research should prioritise the development of real-time alert systems, autonomous intervention protocols, and solutions optimised for reliability, sustainability, usability, and adaptability across heterogeneous care settings.
{"title":"Empowering caregivers of individuals with autism spectrum disorder through sensor-based monitoring of emotional dysregulation: A scoping review","authors":"Moid Sandhu , Siddique Latif , Andrew Bayor , Wei Lu , Mahnoosh Kholghi , Deepa Prabhu , David Silvera-Tawil","doi":"10.1016/j.ijmedinf.2026.106262","DOIUrl":"10.1016/j.ijmedinf.2026.106262","url":null,"PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106262"},"PeriodicalIF":4.1,"publicationDate":"2026-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-09 DOI: 10.1016/j.ijmedinf.2026.106264
Yang Gao, Yingjie Lu, Xiaofei Li
{"title":"From promise to practice: strengthening evidence for AI conversational agents in healthcare","authors":"Yang Gao, Yingjie Lu, Xiaofei Li","doi":"10.1016/j.ijmedinf.2026.106264","DOIUrl":"10.1016/j.ijmedinf.2026.106264","url":null,"abstract":"","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106264"},"PeriodicalIF":4.1,"publicationDate":"2026-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-08 DOI: 10.1016/j.ijmedinf.2026.106274
Hang Chen , Wenchao Dai , Jun Yang , Xin Dang , Li Jiang
Purpose
Small cell lung cancer (SCLC) is a highly aggressive malignancy with a high incidence of liver metastases, particularly among elderly patients, which significantly worsens survival outcomes. However, efficient predictive tools targeting this population remain scarce. This study aimed to develop and validate an interpretable machine learning-based model to re-stratify the risk of liver metastasis in elderly patients with SCLC after completion of routine staging evaluation at initial diagnosis.
Methods
A total of 10,080 patients aged ≥60 years with histologically confirmed SCLC were included from the SEER database (2010–2017) and the Affiliated Hospital of North Sichuan Medical College, China (2010–2024). Patients from SEER were randomly assigned to a training set (n = 7719) and an internal validation set (n = 1930), while 431 patients from China comprised the external validation set. Feature selection was performed using the Boruta algorithm, which identified 11 key variables. Seven machine learning (ML) models, namely Logistic Regression, Naïve Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, XGBoost, and LightGBM, were developed and their predictive performance was compared. The optimal model was further interpreted using SHAP (SHapley Additive exPlanations).
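As a rough illustration of the random assignment described above, the following minimal sketch splits a list of placeholder patient identifiers into sets of the reported sizes. The abstract does not state how the split was performed, so the helper name `random_split`, the seed, and the use of integer identifiers are all assumptions made for illustration.

```python
import random

def random_split(ids, n_train, seed=0):
    """Shuffle the ids reproducibly and cut them into two disjoint sets.

    Hypothetical helper: the abstract reports only the resulting set sizes,
    not the splitting procedure or any random seed.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder identifiers standing in for the 7719 + 1930 SEER patients.
seer_ids = range(7719 + 1930)
train_ids, val_ids = random_split(seer_ids, n_train=7719)

print(len(train_ids), len(val_ids))
print(round(len(train_ids) / len(seer_ids), 2))
```

The reported counts correspond to a training share of about 80% of the SEER cohort; that ratio is inferred here from the numbers rather than stated in the abstract.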
Results
The incidence of liver metastasis was 32.89%, 35.39%, and 32.71% in the training, internal validation, and external validation sets, respectively. In the internal validation set, XGBoost achieved the best overall discriminative performance, with an AUC of 0.820, slightly outperforming LightGBM (0.819), logistic regression (0.813), and random forest (0.811). In the external validation set, the performance of all models declined. Given its relatively superior predictive performance, XGBoost was selected as the final model for interpretability analyses. SHAP analysis indicated that limited- versus extensive-stage disease (LDS/EDS), tumor stage, bone metastasis, and brain metastasis were the most influential features contributing to the model predictions.
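For readers less familiar with the AUC values compared above, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc`, the labels, and both score lists are invented toy data, not results from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Equivalent to the normalised Mann-Whitney U statistic; ties count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy labels (1 = liver metastasis) and scores from two hypothetical models.
labels = [1, 0, 1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.7, 0.4, 0.6, 0.5, 0.1, 0.8]
model_b = [0.9, 0.6, 0.4, 0.3, 0.7, 0.5, 0.2, 0.8]

print(roc_auc(labels, model_a))  # 1.0
print(roc_auc(labels, model_b))  # 0.875
```

The pairwise form is quadratic in the number of cases and is shown only for clarity; on cohorts of the size reported here a rank-based implementation would normally be used.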
Conclusion
The XGBoost-based model exhibited moderate predictive value and satisfactory interpretability in assessing the risk of liver metastasis in elderly patients with SCLC, suggesting its potential utility as an adjunctive decision-support tool following initial diagnostic staging. Nevertheless, its generalizability across different populations requires further validation, and localized recalibration may be necessary before broader clinical implementation.
{"title":"Interpretable machine learning-based prediction of liver metastasis risk in elderly patients with small cell lung Cancer: A study based on the SEER database and external validation in a Chinese cohort","authors":"Hang Chen , Wenchao Dai , Jun Yang , Xin Dang , Li Jiang","doi":"10.1016/j.ijmedinf.2026.106274","DOIUrl":"10.1016/j.ijmedinf.2026.106274","url":null,"abstract":"","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"209 ","pages":"Article 106274"},"PeriodicalIF":4.1,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}