Journal of the American Medical Informatics Association: Latest Articles

Benchmarking LLMs for hospital-course summarisation: aligning metrics with clinical factuality, safety, and robustness.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-10; DOI: 10.1093/jamia/ocag023
Jose Zablah, Yolly Molina, Antonio Garcia Loureiro
Citations: 0
Evidence-based medicine on FHIR augments the standards-based approach to digital health research.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-10; DOI: 10.1093/jamia/ocag024
Brian S Alper, Joanne Dehnbostel, Harold Lehmann
Citations: 0
Letter to the Editor in response to "Optimizing example-selection in retrieval-augmented biomedical in-context learning: reflections on the MMRAG study".
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-09; DOI: 10.1093/jamia/ocaf237
Zaifu Zhan, Rui Zhang
Citations: 0
Comparing ambient scribes: a randomized crossover clinical trial addressing ambient scribe technologies' impact on physician burnout.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-06; DOI: 10.1093/jamia/ocag018
Anand Chowdhury, Michele Casey, Jonathan Wilson, Kathryn I Pollak, Benjamin A Goldstein, Armando Bedoya, Eric G Poon

Objective: This study aims to compare the effectiveness of 2 ambient AI scribe technologies in reducing physician burnout, improving workflow satisfaction, and enhancing documentation efficiency through a randomized crossover trial.

Materials and methods: An open-label randomized crossover trial involving 160 outpatient clinicians was conducted at a tertiary academic medical center. Volunteers were randomized to 2 groups of 80 with 2 crossover periods. We assessed workflow satisfaction (1-7 scale), burnout (Copenhagen Burnout Index), and efficiency metrics (eg, electronic health record time outside scheduled hours and documentation time). Data were analyzed using Wilcoxon signed-rank tests and generalized linear mixed models.

Results: Surveys from 136 respondents were analyzed. Clinicians reported greater improvements in satisfaction with product B (2.51 points on a 7-point scale) compared to product A (1.91 points; mean difference: 0.60, 95% CI: 0.32-0.90). Both tools reduced personal and work burnout scores, but differences between tools were not meaningful. Product B demonstrated greater reductions in average minutes-in-notes per day compared to product A (B - A = -3.19 minutes; 95% CI -4.87 to -1.50). No meaningful differences were observed in pajama time or patient-related burnout.
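As a reading aid for the figures quoted above, the short Python sketch below recomputes the reported satisfaction difference and checks whether each reported 95% CI lies entirely on one side of zero. It only re-uses numbers from the abstract; the helper name `ci_excludes_zero` is hypothetical, and this is not the authors' analysis, which used Wilcoxon signed-rank tests and generalized linear mixed models.

```python
def ci_excludes_zero(lower: float, upper: float) -> bool:
    """Return True when the interval [lower, upper] lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# Values quoted in the Results paragraph of the abstract.
satisfaction_diff = round(2.51 - 1.91, 2)  # product B minus product A, 7-point scale
satisfaction_ci = (0.32, 0.90)             # 95% CI for the satisfaction difference
notes_minutes_ci = (-4.87, -1.50)          # 95% CI for B - A, minutes-in-notes per day

print("Satisfaction difference:", satisfaction_diff)
print("Satisfaction CI excludes zero:", ci_excludes_zero(*satisfaction_ci))
print("Minutes-in-notes CI excludes zero:", ci_excludes_zero(*notes_minutes_ci))
```

Both quoted intervals exclude zero, which is consistent with the abstract describing product B as ahead on satisfaction and documentation time.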

Discussion: Both tools improved workflow satisfaction and reduced burnout, with product B showing superior performance in satisfaction and documentation time. However, efficiency metrics like pajama time were largely unaffected, potentially due to participant selection bias and the study period's timing.

Conclusion: Product B yielded greater satisfaction and time savings compared to product A, though both tools effectively reduced physician burnout and improved workflow satisfaction.

Citations: 0
The subtleties of abolishing "race correction" in clinical artificial intelligence.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-04; DOI: 10.1093/jamia/ocag012
Moustafa Abdalla, LLana James, David S Jones, Mohamed Abdalla

Objectives: To explore the complexities of eliminating race correction in clinical artificial intelligence (AI) and the pitfalls of naive solutions, and to propose systematic strategies for equitable model development.

Background and significance: Race correction in clinical AI, as in traditional medicine, introduces biases with potentially harmful consequences. Simple removal of race from models is insufficient due to the lasting influence of historically biased data.

Approach: We analyze 4 standardized scenarios to demonstrate how race correction manifests in clinical AI: use of race-corrected variables, explicit inclusion of race, inference via proxy variables, and use of race-specific models.

Results: For each scenario, the intuitive solution to removing race correction fails to eliminate bias, often due to legacy effects embedded in the data. More thoughtful approaches are required.

Discussion: Ending race correction in clinical AI requires deliberate, context-sensitive interventions, inclusion of diverse stakeholders, and strategies to make model reasoning more transparent and auditable.

Citations: 0
Development of BERT-based large language models for emergency department triage using real-world conversations.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-04; DOI: 10.1093/jamia/ocag007
Sukyo Lee, Sumin Jung, Jong-Hak Park, Hanjin Cho, Sungwoo Moon, Sejoong Ahn

Objectives: Accurate triage in emergency departments (ED) is critical for appropriate resource allocation. While artificial intelligence (AI) has been explored for triage, prior models relied on summarized clinical scenarios. We aimed to develop and evaluate large language models (LLMs) trained on real-world clinical conversations to classify patient urgency.

Materials and methods: We used a nationally curated dataset of anonymized triage-level conversations from 3 tertiary Korean hospitals. Two BERT-based models were developed to classify urgency per the Korean Triage and Acuity Scale (KTAS) into urgent (KTAS 3) or non-urgent (KTAS 4-5). One model tokenized the entire conversation, while the other applied a hierarchical structure with sentence-level tokenization and speaker-role embeddings. Performance metrics included accuracy, precision, recall, and F1-score. We compared our models against ChatGPT GPT-4o and ClinicalBERT, and assessed explainability using SHapley Additive exPlanations (SHAP).
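The performance metrics named above (accuracy, precision, recall, and F1-score) can be sketched for a binary urgent/non-urgent task as follows. This is a minimal illustration under stated assumptions: the function name `binary_metrics`, the label strings, and the toy labels are hypothetical and do not come from the study's data or code.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive="urgent"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels for illustration only (not study data).
y_true = ["urgent", "urgent", "urgent", "non-urgent", "non-urgent"]
y_pred = ["urgent", "urgent", "non-urgent", "urgent", "non-urgent"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recall on the urgent class is the share of truly urgent cases that the model flags, which is why the abstract reports it separately for urgent cases.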

Results: Of a total of 5244 clinical conversations, 1057 triage-level dialogues were used, with 950 for training and 107 for testing. Our model with hierarchical structure achieved an accuracy of 75.94%, significantly outperforming ChatGPT (56.68%) and fine-tuned ClinicalBERT (69.42%). For urgent cases, the best model achieved a recall of 0.9610, outperforming ChatGPT (0.5352). SHapley Additive exPlanations analysis confirmed that our model focused on clinically relevant cues aligned with KTAS criteria.

Conclusion: BERT-based LLMs trained on real-world ED conversations significantly outperform general-purpose models like ChatGPT in triage accuracy. This approach demonstrates the potential for enhancing clinical decision support with interpretable and efficient AI.

Citations: 0
Human-Large Language Model collaboration for systematic ontology updates: a case study in the domain of dietary lifestyle.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-04; DOI: 10.1093/jamia/ocag015
Jinsun Jung, Ricky Taira, Hyeoneui Kim

Objectives: To develop and evaluate a human-LLM (Large Language Model) collaborative approach for systematic ontology updating, demonstrated with the Dietary Lifestyle Ontology (DILON).

Materials and methods: One hundred dietary questionnaire items from English and Korean sources were semantically annotated by 4 state-of-the-art language models, which generated candidate concepts for inclusion in DILON. Outputs were refined through cross-model reconciliation, followed by expert review. The model curated the concepts within DILON, and experts reviewed and refined the outputs in Protégé to ensure accuracy and consistency.

Results: Claude Sonnet 4 effectively supported local tasks, including harvesting new concepts, detecting redundancies, and refining hierarchical segments. Global optimization of ontology, however, required systematic examination by human experts.

Discussion: These findings highlight the complementary strengths of LLMs and humans: LLMs accelerate repetitive and local updates, whereas humans maintain overall structural integrity.

Conclusion: Human-LLM collaboration improves efficiency, scalability, and sustainability in ontology engineering, supporting the maintenance of complex biomedical ontologies.

Citations: 0
Evaluating ambient artificial intelligence documentation: effects on work efficiency, documentation burden, and patient-centered care.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-01; DOI: 10.1093/jamia/ocaf180
Yawen Guo, Jiayuan Wang, Di Hu, Steven Tam, Charles Gilman, Emilie Chow, Danielle Perret, Deepti Pandita, Kai Zheng

Background and significance: Ambient listening tools powered by generative artificial intelligence (GenAI) offer real-time, scribe-like support that reduces documentation burden and may help alleviate burnout. This study assesses physician-perceived benefits and challenges of ambient AI implementation through surveys and evaluates its effectiveness in clinical workflows using automatically recorded electronic health record (EHR) time-efficiency metrics.

Method and materials: A quality improvement pilot has been underway at UCI Health since December 2023. Epic EHR Signal metrics were analyzed to assess changes in note length, documentation time, and same-day encounter closure rates. Matched pre- and post-implementation surveys evaluated physician-perceived changes in documentation burden, clinical efficiency, and care quality. We also examined open-ended survey responses using thematic analysis to supplement quantitative findings.

Results: Analysis of EHR usage data from 167 physicians showed significant reductions in note-writing time, despite an increase in note length. Survey responses (n = 65) also indicated statistically significant improvements across multiple domains. Physicians reported reduced cognitive demand (P = .031) and documentation effort (P = .014), alongside perceptions of enhanced clinical efficiency, patient-centered care, and EHR system usability. Thematic analysis confirmed these quantitative findings and identified opportunities for improvement, including specialty-specific customization and expanded AI functionality.

Discussion: Ambient AI tools demonstrated improved documentation efficiency, perceived care quality, and reduced cognitive workload. These benefits suggest potential to alleviate key burdens in clinical documentation.

Conclusion: Future development should prioritize customization for specialty-specific and individual physician needs, ensure the reliability and accuracy of AI-generated content, and integrate ethical and legal considerations to facilitate safe and scalable implementation in patient-centered care contexts.

Pages: 273-282. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844569/pdf/
Citations: 0
Clinician, patient, and organizational perspectives on ambient AI scribes.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-01; DOI: 10.1093/jamia/ocaf231
Suzanne Bakken
Vol. 33 (2), pp. 253-255. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844586/pdf/
Citations: 0
Gaps in artificial intelligence research for rural health in the United States: a scoping review.
IF 4.6; Zone 2 (Medicine); Q1 (Computer Science, Information Systems); Pub Date: 2026-02-01; DOI: 10.1093/jamia/ocaf206
Katherine E Brown, Sharon E Davis

Objective: Artificial intelligence (AI) has impacted healthcare at urban and academic medical centers in the US. There are concerns, however, that the promise of AI may not be realized in rural communities. This scoping review aims to determine the extent of AI research in the rural US.

Materials and methods: We conducted a scoping review following the PRISMA guidelines. We included peer-reviewed, original research studies indexed in PubMed, Embase, and Web of Science after January 1, 2010 and through April 29, 2025. Studies were required to discuss the development, implementation, or evaluation of AI tools in rural US healthcare, including frameworks that help facilitate AI development (eg, data warehouses).

Results: Our search strategy found 26 studies meeting inclusion criteria after full text screening with 14 papers discussing predictive AI models and 12 papers discussing data or research infrastructure. AI models most often targeted resource allocation and distribution. Few studies explored model deployment and impact. Half noted the lack of data and analytic resources as a limitation. None of the studies discussed examples of generative AI being trained, evaluated, or deployed in a rural setting.

Discussion: Practical limitations may be influencing and limiting the types of AI models evaluated in the rural US. Validation of tools in the rural US was underwhelming.

Conclusion: With few studies moving beyond AI model design and development stages, there are clear gaps in our understanding of how to reliably validate, deploy, and sustain AI models in rural settings to advance health in all communities.

Pages: 509-520. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844595/pdf/
Citations: 0