Generating Synthetic Healthcare Dialogues in Emergency Medicine Using Large Language Models.

Studies in health technology and informatics. Pub Date: 2024-11-22. DOI: 10.3233/SHTI241099

Denis Moser, Matthias Bender, Murat Sariyar

Volume 321, pages 235-239. Publication type: Journal Article. https://doi.org/10.3233/SHTI241099


Abstract

Natural Language Processing (NLP) has shown promise in fields like radiology for converting unstructured data into structured data, but acquiring suitable datasets poses several challenges, including privacy concerns. Specifically, we aim to use Large Language Models (LLMs) to extract medical information from dialogues between ambulance staff and patients in order to populate emergency protocol forms. However, we currently lack dialogues with known content that can serve as a gold standard for evaluation. We designed a pipeline that uses the quantized LLM "Zephyr-7b-beta" for initial dialogue generation, followed by refinement and translation with OpenAI's GPT-4 Turbo. The MIMIC-IV database provided the relevant medical data. The evaluation consisted of an accuracy assessment via Retrieval-Augmented Generation (RAG) and sentiment analysis using multilingual models. Initial results showed a high accuracy of 94% with "Zephyr-7b-beta," which decreased slightly to 87% after refinement with GPT-4 Turbo. Sentiment analysis indicated a qualitative shift towards more positive sentiment after refinement. These findings highlight the potential and challenges of using LLMs to generate synthetic medical dialogues and can inform future NLP system development in healthcare.
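To make the idea of a gold standard with "known content" concrete, the following is a minimal sketch of how an accuracy score for one synthetic dialogue could be computed. It is not the authors' method: the abstract states that accuracy was assessed via RAG, whereas this sketch substitutes a plain case-insensitive substring match for the retrieval step. The names `Record`, `fact_recovered`, and `dialogue_accuracy` are hypothetical, and the record and dialogue are invented for illustration (they are not MIMIC-IV data).

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical gold-standard record: field name -> expected value."""
    fields: dict[str, str]


def fact_recovered(dialogue: str, value: str) -> bool:
    # Assumption: a substring match stands in for the RAG-based check
    # described in the abstract. It only finds literal mentions.
    return value.lower() in dialogue.lower()


def dialogue_accuracy(dialogue: str, record: Record) -> float:
    """Return the fraction of gold fields whose value appears in the dialogue."""
    if not record.fields:
        raise ValueError("record has no fields to check")
    hits = sum(fact_recovered(dialogue, v) for v in record.fields.values())
    return hits / len(record.fields)


# Invented example data, for illustration only.
record = Record(
    {"chief_complaint": "chest pain", "age": "67", "allergy": "penicillin"}
)
dialogue = (
    "Paramedic: How old are you? Patient: I'm 67. "
    "Paramedic: What's wrong? Patient: I have chest pain."
)

# Two of the three gold fields are mentioned; the allergy is missing.
accuracy = dialogue_accuracy(dialogue, record)
```

A literal match like this would miss paraphrases (for example "pain in my chest"), which is one reason a retrieval-based or model-based check may be preferred in practice.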


