JAMIA Open: latest articles, page 9

Clinical and economic impact of digital dashboards on hospital inpatient care: a systematic review.

IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-07-26 eCollection Date: 2025-08-01 DOI: 10.1093/jamiaopen/ooaf078

Enrico Coiera, Anastasia Chan, Kalissa Brooke-Cowden, Hania Rahimi-Ardabili, Nicole Halim, Catalin Tufanaru

Objective: Digital dashboards are used to monitor patients and improve inpatient outcomes in hospital settings. A systematic review assessed the impact of dashboards across five outcomes: hospital mortality, hospital length of stay (LOS), economic impacts, harms, and patient and carer satisfaction.

Materials and methods: Nine databases were searched from inception to May 2024. Studies were included if they reported primary quantitative research on dashboard interventions in hospital settings, were in English, and measured effectiveness for patients, caregivers, healthcare professionals or services. Data synthesis was performed via narrative review. Risk of bias was measured using Cochrane ROBINS-I and RoB 2.

Results: We identified 5755 articles, and 70 met inclusion criteria. Of 20 findings reporting mortality (16 studies), five reported a decrease, whilst the majority (n = 15) found no significant change. LOS was reported across 43 findings (31 studies), with 28 reporting a reduction, five an increase, and ten no change. Of 21 findings (from 16 studies) reporting on harms, increases were observed in six, decreases in four, and no change in 11. Economic impacts were reported in 34 findings (31 studies), with the majority demonstrating reduced costs (n = 29), one an increase, and four no change. Eight findings (eight studies) reported on patient and carer satisfaction with care, with the majority (n = 6) demonstrating increased satisfaction, and two reporting no change.

Discussion: Hospital dashboards do appear associated with either no change or a reduction in mortality, reduced costs, reduced LOS, and improved patient and caregiver satisfaction with care. Association with harms was equivocal.

Conclusion: While there is evidence of potential benefits, actual impacts of hospital digital dashboards will likely be dependent on multiple local factors such as workflow integration.
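
The finding tallies in the Results paragraph can be cross-checked and turned into proportions with a few lines of Python. This is a minimal sketch: the counts are copied from the abstract, while the `FINDINGS` layout, the direction labels, and the `summarize` helper are illustrative names and not part of the review. The zero counts (no mortality increases, no satisfaction decreases) are inferred from the reported totals.

```python
# Finding counts per outcome, copied from the Results paragraph.
# Layout and key names are illustrative assumptions, not from the review.
FINDINGS = {
    "mortality": {"decrease": 5, "increase": 0, "no_change": 15, "total": 20},
    "length_of_stay": {"decrease": 28, "increase": 5, "no_change": 10, "total": 43},
    "harms": {"decrease": 4, "increase": 6, "no_change": 11, "total": 21},
    "economic_cost": {"decrease": 29, "increase": 1, "no_change": 4, "total": 34},
    "satisfaction": {"decrease": 0, "increase": 6, "no_change": 2, "total": 8},
}


def summarize(findings):
    """Return, per outcome, the share of findings in each direction."""
    summary = {}
    for outcome, counts in findings.items():
        total = counts["total"]
        directions = {k: v for k, v in counts.items() if k != "total"}
        # Guard against transcription errors: directions must add up to the total.
        if sum(directions.values()) != total:
            raise ValueError(f"counts for {outcome} do not sum to {total}")
        summary[outcome] = {k: round(v / total, 2) for k, v in directions.items()}
    return summary


if __name__ == "__main__":
    for outcome, shares in summarize(FINDINGS).items():
        print(outcome, shares)
```

Every outcome passes the consistency check, so the direction counts quoted in the abstract add up to the stated number of findings.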



Citations: 0

SurgeryLSTM: a time-aware neural model for accurate and explainable length of stay prediction after spine surgery.

IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-07-25 eCollection Date: 2025-08-01 DOI: 10.1093/jamiaopen/ooaf079

Ha Na Cho, Sairam Sutari, Alexander Lopez, Hansen Bow, Kai Zheng

Objective: To develop and evaluate machine learning (ML) models for predicting length of stay (LOS) in elective spine surgery, with a focus on the benefits of temporal modeling and model interpretability.

Materials and methods: We compared traditional ML models (eg, Linear Regression, Random Forest, Support Vector Machine [SVM], and XGBoost) with our developed model, SurgeryLSTM, a masked bidirectional long short-term memory (BiLSTM) network with an attention mechanism, using structured perioperative electronic health record (EHR) data. Performance was evaluated using the coefficient of determination (R²), and key predictors were identified using explainable AI.

Results: SurgeryLSTM achieved the highest predictive accuracy (R² = 0.86), outperforming XGBoost (R² = 0.85) and baseline models. The attention mechanism improved interpretability by dynamically identifying influential temporal segments within preoperative clinical sequences, allowing clinicians to trace which events or features most contributed to each LOS prediction. Bone disorder, chronic kidney disease, and lumbar fusion were identified as the most impactful predictors of LOS.

Discussion: Temporal modeling with attention mechanisms significantly improves LOS prediction by capturing the sequential nature of patient data. Unlike static models, SurgeryLSTM provides both higher accuracy and greater interpretability, which are critical for clinical adoption. These results highlight the potential of integrating attention-based temporal models into hospital planning workflows.

Conclusion: SurgeryLSTM presents an effective and interpretable AI solution for LOS prediction in elective spine surgery. Our findings support the integration of temporal, explainable ML approaches into clinical decision support systems to enhance discharge readiness and individualized patient care.
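
For readers unfamiliar with the evaluation metric named in Materials and methods, the coefficient of determination can be computed as one minus the ratio of residual to total sum of squares. The sketch below assumes that standard definition; the function name `r_squared` and the length-of-stay values are hypothetical and are not taken from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical observed and predicted lengths of stay, in days.
    observed = [2, 3, 5, 4, 6]
    predicted = [2.5, 3, 4.5, 4, 6]
    print(round(r_squared(observed, predicted), 2))
```

A value of 1 means the predictions match the observations exactly, and 0 means the model does no better than predicting the mean length of stay for every patient.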



Citations: 0

Electronic health record activity changes around new decision support implementation: monitoring using audit logs and topic modeling.

IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-07-15 eCollection Date: 2025-08-01 DOI: 10.1093/jamiaopen/ooaf050

Jinying Chen, Sarah L Cutrona, Ajay Dharod, Adam Moses, Aaron Bridges, Brian Ostasiewski, Kristie L Foley, Thomas K Houston

Objectives: To develop and test a novel machine learning approach for monitoring impact of computerized clinical decision support (CDS) tools on clinicians' electronic health record (EHR) activities.

Materials and methods: Our CDS monitoring approach leverages topic modeling, a latent-variable statistical machine learning method, to infer health providers' EHR activities from EHR audit logs. We applied this approach to monitor the impact of a tobacco cessation support CDS tool newly implemented in 5 cancer clinics (2018-2021). We trained the topic model on EHR audit log data from 3445 encounters (pre-CDS-implementation: 1734, post-CDS-implementation: 1711) for patients with active smoking status. The number of topics was automatically determined based on within-topic coherence and across-topic divergence, and the identified topics were assigned clinically relevant EHR activity labels by 4 domain experts.

Results: The topic model identified 2 distinct activities focusing on CDS (act on CDS, bypass/postpone CDS), 2 activities related to CDS (review patient records and address alerts, use note templates and acknowledge the completion of CDS), 6 related to accessing (access patient station) and reviewing patient data (external records, synopsis data, snapshot of patient data, problem list/diagnosis/notes, treatment plan), and 4 related to modifying EHR (modify diagnosis/problem lists, document visit with record review, perform administrative activities for visit and billing, and document follow-up care plan). Comparing matched 1-hour after-check-in windows post-implementation (n = 841) versus pre-implementation (n = 841) of CDS, the mean prevalence (expressed as proportions out of 1.0) of providers' EHR-use activity increased on CDS-focused activities (0.073, 95% CI, 0.066-0.079) and CDS-related activities (0.098, 95% CI, 0.089-0.106) and decreased on modifying EHR (-0.113, 95% CI, -0.124 to -0.102) and reviewing patient data (-0.058, 95% CI, -0.072 to -0.044).

Discussion: Our topic model-based CDS monitoring approach can identify shifts in prevalence of EHR-use activities pre-implementation versus post-implementation. This approach can be applied to detect unintended changes in EHR activities on a large population scale following CDS implementation, providing valuable insights to guide focused qualitative investigations for CDS improvement or de-implementation.

Conclusion: Our approach offers a scalable, data-driven framework for evaluating the real-world impact of EHR-embedded CDS tools. Built on a generic machine learning framework, this approach could be adapted to explore impact of other healthcare quality improvement strategies using EHR-integrated CDS interventions.
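
The Results report changes in mean activity prevalence with 95% confidence intervals across matched pre- and post-implementation windows. The abstract does not say how the intervals were derived, so the following is only a sketch under stated assumptions: it treats the windows as pairs, uses a normal approximation (z = 1.96), and runs on hypothetical prevalence values. The function name `paired_mean_difference` is illustrative.

```python
import math
from statistics import mean, stdev


def paired_mean_difference(pre, post, z=1.96):
    """Mean of (post - pre) with a normal-approximation confidence interval.

    Assumes pre[i] and post[i] belong to the same matched pair; this pairing
    and the z-based interval are assumptions, not the study's stated method.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pairs of equal length")
    diffs = [b - a for a, b in zip(pre, post)]
    d_mean = mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    se = stdev(diffs) / math.sqrt(len(diffs))
    return d_mean, d_mean - z * se, d_mean + z * se


if __name__ == "__main__":
    # Hypothetical topic prevalence (proportion out of 1.0) per matched window.
    pre = [0.10, 0.20, 0.15, 0.05]
    post = [0.20, 0.25, 0.25, 0.10]
    diff, low, high = paired_mean_difference(pre, post)
    print(f"{diff:.3f} (95% CI {low:.3f} to {high:.3f})")
```

With only four pairs the normal approximation is crude; it is more defensible at the scale reported in the abstract (841 windows per period), where a bootstrap interval would be a reasonable alternative.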



Citations: 0

Cross-institutional dental electronic health record entity extraction via generative artificial intelligence and synthetic notes.

IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-06-28 eCollection Date: 2025-06-01 DOI: 10.1093/jamiaopen/ooaf061

Yao-Shun Chuang, Chun-Teh Lee, Guo-Hao Lin, Ryan Brandon, Xiaoqian Jiang, Muhammad F Walji, Oluwabunmi Tokede

Background: While most health-care providers now use electronic health records (EHRs) to document clinical care, many still treat them as digital versions of paper records. As a result, documentation often remains unstructured, with free-text entries in progress notes. This limits the potential for secondary use and analysis, as machine-learning and data analysis algorithms are more effective with structured data.

Objective: This study aims to use advanced artificial intelligence (AI) and natural language processing (NLP) techniques to improve diagnostic information extraction from clinical notes in a periodontal use case. By automating this process, the study seeks to reduce missing data in dental records and minimize the need for extensive manual annotation, a long-standing barrier to widespread NLP deployment in dental data extraction.

Materials and methods: This research utilizes large language models (LLMs), specifically Generative Pretrained Transformer 4, to generate synthetic medical notes for fine-tuning a RoBERTa model. This model was trained to better interpret and process dental language, with particular attention to periodontal diagnoses. Model performance was evaluated by manually reviewing 360 clinical notes randomly selected from each participating site's dataset.

Results: The results demonstrated high accuracy of periodontal diagnosis data extraction, with sites 1 and 2 achieving weighted average scores of 0.97-0.98. This performance held across all dimensions of periodontal diagnosis: stage, grade, and extent.

Discussion: Synthetic data effectively reduced manual annotation needs while preserving model quality. Generalizability across institutions suggests viability for broader adoption, though future work is needed to improve contextual understanding.

Conclusion: The study highlights the potential transformative impact of AI and NLP on health-care research. Most clinical documentation (40%-80%) is free text. Scaling our method could enhance clinical data reuse.



Citations: 0

Measles Tracker: a near-real-time data hub for measles surveillance.

IF 2.5 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-06-27 eCollection Date: 2025-06-01 DOI: 10.1093/jamiaopen/ooaf062

Francesco Branda, Maria Tomasso, Mohamed Mustaf Ahmed, Massimo Ciccozzi, Fabio Scarpa

Objectives: Measles continues to pose a serious threat to global public health, fueled by declining vaccination rates, international travel, and persistent immunization gaps. Early outbreak detection and response remain hampered by fragmented surveillance systems, which often lack interoperability and limit data accessibility.

Materials and methods: To address the major limitations of current measles surveillance systems, including data fragmentation and lack of standardization, we developed Measles Tracker, an integrated near-real-time data hub that centralizes and harmonizes measles surveillance data in the United States using publicly available sources. The system aggregates data from multiple layers, including: (1) official reports from public health agencies, (2) epidemiological surveillance bulletins, and (3) outbreak reports, mainly captured through news websites or via news aggregators. The platform architecture implements (1) geospatial normalization of key epidemiological variables (case counts, vaccination coverage, age-stratified incidence) and (2) dynamic visualization interfaces to support coordination of evidence-based response.

Results: Measles Tracker enhances situational awareness by integrating disparate data streams in near real-time, enabling rapid geospatial detection of outbreak clusters, mapping vaccination gaps, and supporting dynamic risk stratification of vulnerable populations. It is intended exclusively as a complementary tool to official public health systems, providing educational and situational awareness without interfering with contact tracing, vaccination, or outbreak control activities.

Conclusions: As a centralized, scalable tool, Measles Tracker advances measles surveillance by leveraging digital epidemiology principles. Future iterations will incorporate additional data streams (eg, climate variables, genomic surveillance) and advanced analytics (eg, machine learning for risk prediction, network models for transmission dynamics) to further optimize outbreak preparedness and resource allocation. This framework underscores the transformative potential of integrated data systems in global measles elimination efforts.



Citations: 0

A comparative analysis of machine learning models and human expertise for nursing intervention classification.

IF 2.5 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-06-27 eCollection Date: 2025-06-01 DOI: 10.1093/jamiaopen/ooaf057

Jerome Niyirora, Lynne Longtin, Cynthia Grabski, David Patrishkoff, Andriana Semko

Objective: This study compares the performance of machine learning (ML) models and human experts in mapping unstructured nursing notes to the standardized Nursing Interventions Classification (NIC) system. The aim is to advance automated nursing documentation classification, facilitating cross-facility benchmarking of patient care and organizational outcomes.

Materials and methods: We developed and compared 4 ML models: TF-IDF text-based vectorization, UMLS semantic mapping, fine-tuned GPT-4o mini, and Bio-Clinical BERT. These models were evaluated against classifications provided by 2 expert nurses using a dataset of de-identified home healthcare nursing notes obtained from a Florida, USA-based medical clearinghouse. Model performance was assessed using agreement statistics, precision, recall, F1 scores, and Cohen's Kappa.
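The evaluation statistics named above (agreement, F1, Cohen's Kappa) can be illustrated with a minimal sketch. The label names and the two label lists below are hypothetical and are not taken from the study; the abstract does not say how F1 was averaged across NIC categories, so the macro average used here is an assumption.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which two label sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    p_expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def macro_f1(truth, pred):
    """Unweighted mean of per-class F1 scores (an assumed averaging choice)."""
    scores = []
    for c in sorted(set(truth) | set(pred)):
        tp = sum(t == c and p == c for t, p in zip(truth, pred))
        fp = sum(t != c and p == c for t, p in zip(truth, pred))
        fn = sum(t == c and p != c for t, p in zip(truth, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical consensus labels and model predictions for six notes.
consensus = ["drug", "drug", "info", "wound", "drug", "info"]
model = ["drug", "drug", "drug", "wound", "drug", "wound"]

print(round(percent_agreement(consensus, model), 3))
print(round(cohens_kappa(consensus, model), 3))
print(round(macro_f1(consensus, model), 3))
```

In this toy data the model never predicts the minority "info" class, so that class contributes an F1 of zero to the macro average even though raw agreement stays moderate. This mirrors the kind of minority-class weakness the abstract reports.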

Results: Human raters achieved the highest agreement with consensus labels, scoring 0.75 and 0.62, with corresponding F1 scores of 0.61 and 0.45, respectively. In comparison, ML models showed lower performance, with GPT achieving the best among them (agreement: 0.50, F1 score: 0.31). A distribution analysis of NIC categories revealed that ML models performed well in prevalent and clearly defined categories, such as drug management, but struggled with minority classes and context-dependent interventions, like information management.

Discussion: Current ML approaches show promise in supporting clinical classification tasks, but the performance gap in handling complex, context-dependent interventions highlights the need for improved methods that can better capture the nuanced nature of clinical documentation. Future research should focus on developing methods to process clinical terminology and context-specific documentation with greater precision and adaptability.

Conclusion: Current ML models can aid, but not fully replace, human judgment in classifying nuanced nursing interventions.

JAMIA Open 8(3): ooaf057. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12203540/pdf/

Citations: 0

Application of the International Classification of Health Interventions for coding interventions in adults with sensorineural hearing loss.

IF 2.5 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-06-27 eCollection Date: 2025-06-01 DOI: 10.1093/jamiaopen/ooaf063

Faheema Mahomed-Asmail, Ilze Oosthuizen, Catherine Sykes, Soraya Maart, Richard Madden, De Wet Swanepoel, Vinaya Manchaiah

Objective: The International Classification of Health Interventions (ICHI), currently being developed, seeks to span all sectors of the health system. Our objective was to determine the coverage of the ICHI for hearing interventions commonly delivered to adults with sensorineural hearing loss (SNHL).

Material and methods: A 3-phase content mapping method was used: (1) source terms were identified with an expert panel in audiology rehabilitation; (2) 3 coders independently applied the classification to the source terms; and (3) the coders reached a consensus for each intervention, identified reasons for initial discrepancies, and identified options not linked to a specific code.

Results: Nineteen different ICHI Target categories and 23 different ICHI Action categories were identified, with 82% of the Means coded as "Other and unspecified." There was consensus in codes for 54.3% of source terms, and no ICHI code was found for 8.5% of source terms. The greatest number of discrepancies arose from the Action, followed by the Target. Coding discrepancies resulted from misunderstanding of source terms and their clinical use, and from difficulty determining the type of Target.

Discussion: Despite its broad scope, ICHI's current framework has gaps in its coverage of audiological interventions, particularly those related to sensorineural hearing loss. Addressing these gaps is crucial for improving global data standardization and facilitating the development of more targeted hearing health policies.

Conclusion: This study makes an important contribution to the further development and refinement of the classification, specifically in the context of hearing healthcare.

JAMIA Open 8(3): ooaf063. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12203548/pdf/

Citations: 0

Establishing data governance for sharing and access to real-world data: a case study.

IF 2.5 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-06-23 eCollection Date: 2025-06-01 DOI: 10.1093/jamiaopen/ooaf041

Heath A Davis, Diva Kerkman, Asher A Hoberg, Michele Countryman, Wendy Beaver, Kiley Bybee, James M Blum, Boyd M Knosp

Importance: Data governance, the policies and procedures for managing data, is a critical factor for secondary use of clinical data for research.

Objectives: This paper describes the evolution of an academic health-care organization's data governance for research, development of an external data sharing process, implementation of related processes, continuous improvement, and ongoing observations of data governance maturity.

Materials and methods: The program was designed to improve the access to and sharing of real-world data for research. Using a combination of qualitative and quantitative methods, we evaluated the program's effectiveness.

Results: Our results describe a significant improvement in data accessibility as seen in new data-driven performance indicators and in data understanding indicated by new processes, policies, and strategies.

Discussion: The paper outlines the development of a data governance process at an academic health center to support external data sharing, emphasizing the importance of data literacy, cross-office collaboration, and structured workflows to manage complex review requirements. The formalized process improved data access, identified gaps, and enabled continuous quality improvement, though it introduced new bottlenecks and required navigating multi-office reviews and researcher education.

Conclusion: These findings suggest data governance practices that may apply to other institutions.

JAMIA Open 8(3): ooaf041. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12206003/pdf/

Citations: 0

Evaluation of falls detected by natural language processing algorithm and not coded external cause of morbidity.

IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-06-20 eCollection Date: 2025-06-01 DOI: 10.1093/jamiaopen/ooaf047

Daniel J Hekman, Apoorva P Maru, Hanna J Barton, Douglas Wiegmann, Manish N Shah, Amy L Cochran, Erkin Ötleş, Brian W Patterson

Objective: Falls are a leading cause of morbidity and mortality among older adults. Common methods for identifying fall-related ED visits within both claims and electronic health record datasets rely on diagnosis code-based definitions, which underestimate the true prevalence of falls. This study applies a natural language processing (NLP) algorithm to ED provider notes to identify patients presenting due to falls and compares the characteristics of NLP-identified cases to those identified through diagnosis codes to identify the impact of identification strategy.

Materials and methods: This cross-sectional study analyzed ED encounter data from older adult patients who visited an ED between December 2016 and 2020. The NLP algorithm identified falls based on provider notes, searching for keywords related to falls and excluding negated and spurious matches. We also applied common ICD code methods to identify falls.
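The general idea of keyword matching with negation and spurious-match exclusion can be sketched as follows. This is only an illustrative toy, not the study's algorithm: the keyword list, the spurious phrases, the negation cues, and the 30-character look-back window are all assumptions made for the example.

```python
import re

# Hypothetical keyword, spurious-phrase, and negation lists (not from the study).
FALL_PATTERN = re.compile(r"\b(fall|falls|fell|fallen|falling)\b", re.IGNORECASE)
SPURIOUS = re.compile(
    r"\b(fall risk|falls? precautions?|fell asleep|falling asleep|in the fall)\b",
    re.IGNORECASE,
)
NEGATION = re.compile(r"\b(no|not|denies|denied|without|negative for)\b", re.IGNORECASE)


def mentions_fall(note: str, window: int = 30) -> bool:
    """Return True if the note contains a non-negated, non-spurious fall keyword."""
    # Blank out spurious phrases with same-length padding so offsets stay valid.
    cleaned = SPURIOUS.sub(lambda m: " " * len(m.group()), note)
    for match in FALL_PATTERN.finditer(cleaned):
        start = max(0, match.start() - window)
        preceding = cleaned[start:match.start()]
        # Only look for negation cues within the current sentence or clause.
        preceding = re.split(r"[.;!?]", preceding)[-1]
        if not NEGATION.search(preceding):
            return True
    return False


print(mentions_fall("Patient fell at home and hit her head."))
print(mentions_fall("Denies any recent falls or dizziness."))
print(mentions_fall("Fall risk assessment completed; patient fell asleep."))
print(mentions_fall("No chest pain. She fell yesterday."))
```

A rule set like this flags any documented fall, regardless of why the patient came to the ED, which is consistent with the abstract's caution that NLP-identified falls may be sequelae of an underlying illness rather than the primary reason for the visit.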

Results: We processed 50 153 ED encounters and the NLP approach identified 14 604 encounters for patients who fell. Of those, 7086 (49%) were not identified using external cause of morbidity ICD codes. Patients identified by just the NLP algorithm exhibited higher Elixhauser comorbidity scores and increased likelihood of 30-day mortality. Patients identified by NLP algorithm but not ICD codes were more likely to have severe underlying conditions such as sepsis or acute kidney disease rather than traumatic injuries.

Discussion: The NLP algorithm identifies many fall-related visits not identified by traditional methods.

Conclusion: If NLP algorithms do not consider the causal relationships between falls and comorbid conditions, they can easily identify patients who fell but whose fall was a sequela of underlying medical illness.

JAMIA Open 8(3): ooaf047. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12204729/pdf/

Citations: 0

Reproducible generative artificial intelligence evaluation for health care: a clinician-in-the-loop approach.

IF 2.5 Q2 HEALTH CARE SCIENCES & SERVICES

JAMIA Open

Pub Date : 2025-06-16 eCollection Date: 2025-06-01 DOI: 10.1093/jamiaopen/ooaf054

Leah Livingston, Amber Featherstone-Uwague, Amanda Barry, Kenneth Barretto, Tara Morey, Drahomira Herrmannova, Venkatesh Avula

Objectives: To develop and apply a reproducible methodology for evaluating generative artificial intelligence (AI) powered systems in health care, addressing the gap between theoretical evaluation frameworks and practical implementation guidance.

Materials and methods: A 5-dimension evaluation framework was developed to assess query comprehension and response helpfulness, correctness, completeness, and potential clinical harm. The framework was applied to evaluate ClinicalKey AI using queries drawn from user logs, a benchmark dataset, and subject matter expert curated queries. Forty-one board-certified physicians and pharmacists were recruited to independently evaluate query-response pairs. An agreement protocol using the mode and modified Delphi method resolved disagreements in evaluation scores.
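A mode-based agreement step of the kind mentioned above might look like the sketch below. The abstract does not give the protocol's details, so the integer rating scale, the status strings, and the rule that a three-way split is escalated to group discussion are all assumptions made for illustration.

```python
from statistics import multimode


def resolve_score(first, second, tie_breaker=None):
    """Resolve one rating dimension for one query-response pair.

    Returns (score, status). The status strings are hypothetical labels.
    """
    if first == second:
        return first, "pairwise consensus"
    if tie_breaker is None:
        return None, "needs third review"
    modes = multimode([first, second, tie_breaker])
    if len(modes) == 1:
        # Two of the three reviewers agree, so the mode is unambiguous.
        return modes[0], "resolved by mode"
    # All three scores differ: assumed to go to a Delphi-style discussion.
    return None, "needs discussion"


print(resolve_score(4, 4))
print(resolve_score(4, 5))
print(resolve_score(4, 5, 5))
print(resolve_score(3, 4, 5))
```

Keeping the status alongside the score makes it straightforward to report how many pairs reached pairwise consensus and how many needed a third review.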

Results: Of 633 queries, 614 (96.99%) produced evaluable responses, with subject matter experts completing evaluations of 426 query-response pairs. Results demonstrated high rates of response correctness (95.5%) and query comprehension (98.6%), with 94.4% of responses rated as helpful. Two responses (0.47%) received scores indicating potential clinical harm. Pairwise consensus occurred in 60.6% of evaluations, with the remaining cases requiring a third, tie-breaker review.

Discussion: The framework demonstrated effectiveness in quantifying performance through comprehensive evaluation dimensions and structured scoring resolution methods. Key strengths included representative query sampling, standardized rating scales, and robust subject matter expert agreement protocols. Challenges emerged in managing subjective assessments of open-ended responses and achieving consensus on potential harm classification.

Conclusion: This framework offers a reproducible methodology for evaluating health-care generative AI systems, establishing foundational processes that can inform future efforts while supporting the implementation of generative AI applications in clinical settings.

JAMIA Open 8(3): ooaf054. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12169418/pdf/

Citations: 0