Using Large Language Models to Extract Core Injury Information From Emergency Department Notes.

IF 3; Zone 3, Medicine; Q1, MEDICINE, GENERAL & INTERNAL; Journal of Korean Medical Science; Pub Date: 2024-12-02; DOI: 10.3346/jkms.2024.39.e291
Dong Hyun Choi, Yoonjic Kim, Sae Won Choi, Ki Hong Kim, Yeongho Choi, Sang Do Shin
Journal of Korean Medical Science, volume 39, issue 46, article e291. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611659/pdf/
Citations: 0

Abstract

Background: Injuries pose a significant global health challenge due to their high incidence and mortality rates. Although injury surveillance is essential for prevention, it is resource-intensive. This study aimed to develop and validate locally deployable large language models (LLMs) to extract core injury-related information from Emergency Department (ED) clinical notes.

Methods: We conducted a diagnostic study using retrospectively collected data from January 2014 to December 2020 from two urban academic tertiary hospitals. One served as the derivation cohort and the other as the external test cohort. Adult patients presenting to the ED with injury-related complaints were included. Primary outcomes included classification accuracies for information extraction tasks related to injury mechanism, place of occurrence, activity, intent, and severity. We fine-tuned a single generalizable Llama-2 model and five distinct Bidirectional Encoder Representations from Transformers (BERT) models for each task to extract information from initial ED physician notes. The Llama-2 model was able to perform different tasks by modifying the instruction prompt. Data recorded in injury registries provided the gold standard labels. Model performance was assessed using accuracy and macro-average F1 scores.
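
The single-model, multi-task setup described in the Methods can be pictured as one fine-tuned model that receives a different instruction for each task. The Python sketch below is illustrative only: the task names come from the abstract, but the label lists, the prompt wording, and the helper names `build_prompt` and `parse_answer` are assumptions, not the authors' actual prompts or registry categories.

```python
from typing import Optional

# Hypothetical label sets: illustrative only, not the injury registry's real categories.
TASKS = {
    "mechanism": ["traffic accident", "fall", "struck by object", "cut or pierce", "other"],
    "intent": ["unintentional", "self-harm", "assault", "unknown"],
    "severity": ["mild", "moderate", "severe"],
}


def build_prompt(task: str, note: str) -> str:
    """Build an instruction prompt for one extraction task over one ED note.

    Only the instruction and option list change between tasks; the note
    and the (assumed) answer slot stay the same, so one model can serve all tasks.
    """
    labels = TASKS[task]  # raises KeyError for an unknown task
    options = "\n".join(f"{i}. {label}" for i, label in enumerate(labels, start=1))
    return (
        "### Instruction:\n"
        f"Read the emergency department note and classify the injury {task}.\n"
        "Answer with exactly one option from the list.\n"
        f"{options}\n\n"
        "### Note:\n"
        f"{note.strip()}\n\n"
        "### Answer:\n"
    )


def parse_answer(task: str, generated: str) -> Optional[str]:
    """Map the model's free-text output back to a label, or None if nothing matches."""
    text = generated.strip().lower()
    for label in TASKS[task]:
        if text.startswith(label):
            return label
    return None


if __name__ == "__main__":
    # The same note is sent once per task with a different instruction.
    note = "Pt slipped on wet stairs at home, R wrist pain and swelling."
    for task in TASKS:
        print(build_prompt(task, note))
```

Returning `None` for unmatched output keeps malformed generations visible to the caller instead of silently assigning them a class.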

Results: The derivation and external test cohorts comprised 36,346 and 32,232 patients, respectively. In the derivation cohort's test set, the Llama-2 model achieved accuracies (95% confidence intervals) of 0.899 (0.889-0.909) for injury mechanism, 0.774 (0.760-0.789) for place of occurrence, 0.679 (0.665-0.694) for activity, 0.972 (0.967-0.977) for intent, and 0.935 (0.926-0.943) for severity. The Llama-2 model outperformed the BERT models in accuracy and macro-average F1 scores across all tasks in both cohorts. Imposing constraints on the Llama-2 model to avoid uncertain predictions further improved its accuracy.
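
The accuracy and macro-average F1 metrics reported above, and the idea of improving accuracy by withholding uncertain predictions, can be sketched in plain Python. The abstract does not say how the constraint on uncertain predictions was implemented; the confidence threshold used here is an assumption chosen for illustration, and the function names and toy data are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def selective_accuracy(y_true, y_pred, confidence, threshold):
    """Accuracy on predictions whose confidence reaches the threshold.

    Returns (accuracy_on_kept, coverage). Thresholding on a confidence score
    is an assumed stand-in for the paper's unspecified constraint.
    """
    kept = [(t, p) for t, p, c in zip(y_true, y_pred, confidence) if c >= threshold]
    coverage = len(kept) / len(y_true)
    acc = sum(t == p for t, p in kept) / len(kept) if kept else float("nan")
    return acc, coverage


if __name__ == "__main__":
    # Toy data, not taken from the study.
    gold = ["fall", "fall", "traffic", "traffic", "other"]
    pred = ["fall", "traffic", "traffic", "traffic", "fall"]
    conf = [0.9, 0.4, 0.95, 0.8, 0.3]
    print(accuracy(gold, pred))
    print(macro_f1(gold, pred))
    print(selective_accuracy(gold, pred, conf, threshold=0.5))
```

Reporting coverage alongside the accuracy of retained predictions makes the trade-off explicit: abstaining raises accuracy only by leaving some notes for manual review.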

Conclusion: Locally deployable LLMs, trained to extract core injury-related information from free-text ED clinical notes, demonstrated good performance. Generative LLMs can serve as versatile solutions for various injury-related information extraction tasks.

Source journal
Journal of Korean Medical Science (category: Medicine, General & Internal)
CiteScore: 7.80
Self-citation rate: 8.90%
Articles published: 320
Review time: 3-6 weeks
About the journal: The Journal of Korean Medical Science (JKMS) is an international, peer-reviewed Open Access journal of medicine published weekly in English. The Journal's publisher is the Korean Academy of Medical Sciences (KAMS), Korean Medical Association (KMA). JKMS aims to publish evidence-based, scientific research articles from various disciplines of the medical sciences. The Journal welcomes articles of general interest to medical researchers, especially when they contain original information. Articles on the clinical evaluation of drugs and other therapies, epidemiologic studies of the general population, studies on pathogenic organisms and toxic materials, and the toxicities and adverse effects of therapeutics are welcome.