Multilingual feasibility of GPT-4o for automated Voice-to-Text CT and MRI report transcription.

IF 3.2; Zone 3 (Medicine); Q1 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING; European Journal of Radiology; Pub Date: 2024-11-17; DOI: 10.1016/j.ejrad.2024.111827
Felix Busch, Philipp Prucker, Alexander Komenda, Sebastian Ziegelmayer, Marcus R Makowski, Keno K Bressem, Lisa C Adams
Citations: 0

Abstract

Purpose: Large language models (LLMs) promise to streamline radiology reporting. With the release of OpenAI's GPT-4o (Generative Pre-trained Transformers-4 omni), which processes not only text but also speech, multimodal LLMs might now also be used as medical speech recognition software for radiology reporting in multiple languages. This proof-of-concept study investigates the feasibility of using GPT-4o for automated voice-to-text transcription of radiology reports in English and German.

Methods: Three readers with varying levels of experience each dictated 100 synthetic radiology reports in both languages using GPT-4o via the ChatGPT iOS mobile application. Reports included CT and MRI scans of various anatomical regions. Evaluation metrics included error type, severity, and correction time. BERTScore and ROUGE metrics were calculated to assess semantic similarity and n-gram overlap between dictated and original reports.
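
As a rough illustration of the n-gram overlap that ROUGE measures, the following is a minimal sketch of ROUGE-N in plain Python. The abstract does not state which ROUGE variants, tokenizer, or software the authors used, so the whitespace tokenization, the lowercasing, the function names `ngrams` and `rouge_n`, and the two example sentences are all assumptions made for illustration, not the study's method or data.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """ROUGE-N precision, recall, and F1 between two strings.

    Assumption: lowercased whitespace tokenization; real toolkits may
    apply stemming or other normalization.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    ref_total = sum(ref.values())
    cand_total = sum(cand.values())
    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical original report sentence and its dictated transcription.
original = "no evidence of pulmonary embolism"
dictated = "no evidence of pulmonary embolus"

print(rouge_n(original, dictated, n=1))
print(rouge_n(original, dictated, n=2))
```

In this made-up pair, a single misrecognized word lowers unigram overlap to 4 of 5 tokens and bigram overlap to 3 of 4. BERTScore, by contrast, compares contextual embeddings of the tokens instead of exact matches, so it is intended to capture semantic similarity even when the surface wording differs.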

Results: No significant differences in correction time between languages were found, but differences were observed between readers based on experience. Error rates were similar for both languages, with most errors being minor (92.68 %, n = 114/123 German; 94.74 %, n = 90/95 English) and technical (27.04 %, n = 43/159 German; 35.65 %, n = 41/115 English) or typographical (23.9 %, n = 38/159 German; 27.83 %, n = 32/115 English). BERTScore metrics were significantly higher for German, while ROUGE metrics showed no significant differences between languages.
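
The reported percentages follow directly from the counts given in parentheses. The short check below recomputes them; the counts are taken from the abstract, while the dictionary layout and the helper name `pct` are illustrative assumptions. Note that the severity figures and the error-type figures use different denominators (123 vs. 159 for German, 95 vs. 115 for English), exactly as reported; the abstract does not explain the difference.

```python
def pct(count, total):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


# (count, total) pairs as reported in the abstract.
reported = {
    ("minor severity", "German"): (114, 123),
    ("minor severity", "English"): (90, 95),
    ("technical", "German"): (43, 159),
    ("technical", "English"): (41, 115),
    ("typographical", "German"): (38, 159),
    ("typographical", "English"): (32, 115),
}

for (category, language), (count, total) in reported.items():
    print(f"{category:15s} {language:8s} {count}/{total} = {pct(count, total):.2f}%")
```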

Conclusion: This study demonstrates the potential of GPT-4o for multilingual transcription of radiology reports, effectively handling both English and German with minimal errors and high semantic understanding. Future research should compare GPT-4o with current radiology dictation tools, assessing performance, cost-effectiveness, and multilingual capabilities across diverse speaker populations.

Source journal
CiteScore: 6.70
Self-citation rate: 3.00%
Articles published: 398
Review time: 42 days
About the journal: European Journal of Radiology is an international journal that aims to communicate state-of-the-art information on imaging developments to its readers in the form of high-quality original research articles and timely reviews on current developments in the field. Its audience includes clinicians at all levels of training, including radiology trainees, newly qualified imaging specialists, and experienced radiologists. Its aim is to inform efficient, appropriate, and evidence-based imaging practice to the benefit of patients worldwide.
Latest articles from this journal
Multilingual feasibility of GPT-4o for automated Voice-to-Text CT and MRI report transcription.
Predicting functional outcome after open lumbar fusion surgery: A retrospective multicenter cohort study
ECG, clinical and novel CT-imaging predictors of necessary pacemaker implantation after transfemoral aortic valve replacement
In-vivo cerebral artery pulsation assessment with Dynamic computed tomography angiography
Diagnostic performance of Photon-counting CT angiography in peripheral artery disease compared to DSA as gold standard