Utilizing RAG and GPT-4 for Extraction of Substance Use Information from Clinical Notes.

Studies in health technology and informatics. Publication date: 2024-11-22. DOI: 10.3233/SHTI241070

Fatemeh Shah-Mohammadi, Joseph Finkelstein

Volume 321, pages 94-98. Publication type: Journal Article. Record source: Semantic Scholar. Link: https://doi.org/10.3233/SHTI241070

Citations: 0

Abstract


Utilizing RAG and GPT-4 for Extraction of Substance Use Information from Clinical Notes.

This research investigates a hybrid Retrieval-Augmented Generation (RAG) and Generative Pre-trained Transformer (GPT) pipeline for extracting and categorizing substance use information from unstructured clinical notes. The aim is to improve the accuracy and efficiency of identifying substance use mentions and determining their status in patient documentation. RAG pre-filters the input to GPT, narrowing the analysis to the most relevant text segments and thereby improving the precision and recall of the extraction. The pipeline was evaluated on the Medical Information Mart for Intensive Care III (MIMIC-III) dataset through manual verification, using recall, precision, F1-score, and accuracy. The results showed high precision (up to 0.99 for drug and alcohol mentions) and substantial recall (0.88 across all substances for usage status).
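To make the two ideas in the abstract concrete, here is a minimal Python sketch of a retrieval pre-filter followed by the evaluation metrics. It is an illustration under assumptions, not the authors' implementation: the abstract does not describe the retriever, embeddings, or prompts, so a simple bag-of-words cosine similarity stands in for the retrieval step, the GPT call is omitted entirely, and the names `retrieve` and `evaluation_metrics` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(segments, query, k=2):
    """Pre-filter step (assumed stand-in): keep the k note segments most
    similar to the query, dropping segments with no overlap at all."""
    q = Counter(tokenize(query))
    scored = [(cosine(Counter(tokenize(s)), q), s) for s in segments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score > 0]


def evaluation_metrics(tp, fp, fn, tn):
    """Precision, recall, F1-score, and accuracy from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, f1, accuracy


# Invented example segments, not taken from MIMIC-III.
segments = [
    "Patient denies alcohol or tobacco use.",
    "Blood pressure 120/80, afebrile.",
    "History of heavy alcohol use, quit in 2010.",
]
relevant = retrieve(segments, "alcohol tobacco drug use", k=2)
# Only `relevant` would be passed on to the language model for extraction.
```

In this sketch the vital-signs segment is filtered out before any model call, which is the narrowing effect the abstract attributes to the RAG step; the counts passed to `evaluation_metrics` would come from manually verified labels.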

