Encouragement vs. liability: How prompt engineering influences ChatGPT-4's radiology exam performance

IF 1.5 | Zone 4 (Medicine) | JCR Q3, Radiology, Nuclear Medicine & Medical Imaging | Clinical Imaging | Pub Date: 2024-09-06 | DOI: 10.1016/j.clinimag.2024.110276

Daniel Nguyen, Allison MacKenzie, Young H. Kim

Clinical Imaging, Volume 115, Article 110276. Source: https://www.sciencedirect.com/science/article/pii/S0899707124002067

Citations: 0

Abstract

Large language models (LLMs) such as ChatGPT-4 hold significant promise in medical applications, especially in radiology. While previous studies have shown the promise of ChatGPT-4 in text-based scenarios, its performance on image-based questions remains suboptimal. This study investigates the impact of prompt engineering on ChatGPT-4's accuracy on the 2022 American College of Radiology In Training Test Questions for Diagnostic Radiology Residents, which include both text-based and image-based questions. Four personas were created, each with unique prompts, and evaluated using ChatGPT-4. Results indicate that encouraging prompts and those disclaiming responsibility led to higher overall accuracy (number of questions answered correctly) than the other personas. Personas that threatened the LLM with legal action or mounting clinical responsibility not only scored lower but also declined to answer questions at a higher rate. These findings highlight the importance of prompt context in optimizing LLM responses and the need for further research to integrate AI responsibly into medical practice.
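The two outcome measures mentioned in the abstract, the number of questions answered correctly and the rate at which the model declined to answer, can be tallied per persona roughly as follows. This is a minimal sketch, not the authors' scoring code: the record layout, the `score_personas` helper, the persona labels, and the choice to count a declined question as not correct are all assumptions made for illustration, and the toy records are not study data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record layout:
# (persona label, model's answer or None if it declined, answer key).
Record = Tuple[str, Optional[str], str]


def score_personas(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Tally correct answers, accuracy, and abstention rate per persona.

    Assumption: a declined question counts toward the total but not toward
    the correct count, so accuracy is correct / all questions asked.
    """
    counts = defaultdict(lambda: {"total": 0, "correct": 0, "declined": 0})
    for persona, answer, key in records:
        c = counts[persona]
        c["total"] += 1
        if answer is None:
            c["declined"] += 1
        elif answer.strip().upper() == key.strip().upper():
            c["correct"] += 1
    return {
        persona: {
            "correct": c["correct"],
            "accuracy": c["correct"] / c["total"],
            "abstention_rate": c["declined"] / c["total"],
        }
        for persona, c in counts.items()
    }


# Toy records for illustration only; these are NOT results from the study.
toy = [
    ("persona_a", "B", "B"),
    ("persona_a", "C", "A"),
    ("persona_b", None, "B"),
    ("persona_b", "A", "A"),
]
summary = score_personas(toy)
for persona, stats in summary.items():
    print(persona, stats)
```

Reporting abstentions separately from wrong answers matters here because a persona can score lower either by answering incorrectly or by refusing to answer, and the abstract distinguishes the two.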

Source journal: Clinical Imaging (Medicine: Nuclear Medicine)

CiteScore: 4.60
Self-citation rate: 0.00%
Articles published: 265
Review time: 35 days

About the journal: The mission of Clinical Imaging is to publish, in a timely manner, the very best radiology research from the United States and around the world with special attention to the impact of medical imaging on patient care. The journal's publications cover all imaging modalities, radiology issues related to patients, policy and practice improvements, and clinically oriented imaging physics and informatics. The journal is a valuable resource for practicing radiologists, radiologists-in-training and other clinicians with an interest in imaging. Papers are carefully peer-reviewed and selected by our experienced subject editors, who are leading experts spanning the range of imaging sub-specialties, which include: Body Imaging; Breast Imaging; Cardiothoracic Imaging; Imaging Physics and Informatics; Molecular Imaging and Nuclear Medicine; Musculoskeletal and Emergency Imaging; Neuroradiology; Practice, Policy & Education; Pediatric Imaging; Vascular and Interventional Radiology.