Generating Synthetic Healthcare Dialogues in Emergency Medicine Using Large Language Models
Denis Moser, Matthias Bender, Murat Sariyar
Studies in health technology and informatics, vol. 321, pp. 235-239. Published 2024-11-22. Journal Article.
DOI: 10.3233/SHTI241099 (https://doi.org/10.3233/SHTI241099)
Citation count: 0
Abstract
Natural Language Processing (NLP) has shown promise in fields such as radiology for converting unstructured data into structured data, but acquiring suitable datasets poses several challenges, including privacy concerns. We aim to use Large Language Models (LLMs) to extract medical information from dialogues between ambulance staff and patients in order to populate emergency protocol forms. However, we currently lack dialogues with known content that could serve as a gold standard for evaluation. To address this, we designed a pipeline that uses the quantized LLM "Zephyr-7b-beta" for initial dialogue generation, followed by refinement and translation with OpenAI's GPT-4 Turbo. The MIMIC-IV database provided the relevant medical data. The evaluation comprised an accuracy assessment via Retrieval-Augmented Generation (RAG) and a sentiment analysis using multilingual models. Initial results showed a high accuracy of 94% with "Zephyr-7b-beta", which decreased slightly to 87% after refinement with GPT-4 Turbo. Sentiment analysis indicated a qualitative shift towards more positive sentiment after refinement. These findings highlight both the potential and the challenges of using LLMs to generate synthetic medical dialogues, and they can inform future NLP system development in healthcare.
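
The abstract does not state how the accuracy figures were computed. As a minimal sketch of why dialogues with known content make such an evaluation possible, the Python below scores the facts recovered from a generated dialogue against the facts it was generated from. The function names, the field names, and the exact-match rule after normalization are illustrative assumptions, not the authors' method, and the records are made-up examples rather than MIMIC-IV data.

```python
from typing import Mapping


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as errors."""
    return " ".join(value.lower().split())


def field_accuracy(gold: Mapping[str, str], extracted: Mapping[str, str]) -> float:
    """Return the fraction of gold-standard fields whose extracted value matches after normalization.

    Assumption: a field counts as correct only on an exact normalized match;
    a missing field counts as incorrect.
    """
    if not gold:
        raise ValueError("gold record must contain at least one field")
    hits = sum(
        1
        for name, expected in gold.items()
        if normalize(extracted.get(name, "")) == normalize(expected)
    )
    return hits / len(gold)


# Hypothetical gold record: the facts a synthetic dialogue was generated from.
gold_record = {
    "chief_complaint": "chest pain",
    "heart_rate": "112",
    "allergies": "penicillin",
    "medication": "aspirin",
}

# Hypothetical values recovered from the generated dialogue, e.g. by a retrieval step.
extracted_record = {
    "chief_complaint": "Chest  pain",
    "heart_rate": "112",
    "allergies": "none",
    "medication": "aspirin",
}

score = field_accuracy(gold_record, extracted_record)
print(f"{score:.2f}")  # prints 0.75: three of the four fields match
```

Under this assumed scheme, running the same scoring on a dialogue before and after a refinement step shows whether the rewrite preserved the source facts.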