Prompt engineering to increase GPT3.5’s performance on the Plastic Surgery In-Service Exams

Journal: Journal of Plastic Reconstructive and Aesthetic Surgery (Impact Factor 2.0; JCR Q2, Surgery)
Publication type: Journal Article
Published: 2024-09-05
DOI: 10.1016/j.bjps.2024.09.001
URL: https://www.sciencedirect.com/science/article/pii/S1748681524005503
Citations: 0

Abstract

This study assesses ChatGPT's (GPT-3.5) performance on the 2021 ASPS Plastic Surgery In-Service Examination using prompt modifications and Retrieval Augmented Generation (RAG). ChatGPT was instructed to act as a "resident," "attending," or "medical student," and RAG utilized a curated vector database for context. Results showed no significant improvement, with the "resident" prompt yielding the highest accuracy at 54%, and RAG failing to enhance performance, with accuracy remaining at 54.3%. Despite appropriate reasoning when correct, ChatGPT's overall performance fell in the 10th percentile, indicating the need for fine-tuning and more sophisticated approaches to improve AI's utility in complex medical tasks.
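
The abstract names two techniques, role-conditioned prompts and retrieval of context from a vector database, without giving the prompts, the database contents, or the scoring code. The following is a minimal Python sketch of how such a pipeline could be wired together, not the authors' implementation. Everything in it is an assumption made for illustration: the prompt wording, the function names `build_prompt`, `retrieve`, and `accuracy`, and the bag-of-words cosine similarity, which stands in for whatever embedding model and vector store the study actually used. No model is called; the sketch stops at the prompt text and at scoring answers that have already been collected.

```python
import math
import re
from collections import Counter

# Roles named in the abstract; the prompt wording below is hypothetical.
ROLES = ("medical student", "resident", "attending")


def build_prompt(role, question, options, context=None):
    """Assemble a role-conditioned multiple-choice prompt (illustrative wording)."""
    lines = [f"Act as a {role} answering a plastic surgery in-service exam question."]
    if context:
        # RAG variant: prepend retrieved passages as reference material.
        lines.append("Reference material:")
        lines.extend(f"- {passage}" for passage in context)
    lines.append(f"Question: {question}")
    lines.extend(f"{label}. {text}" for label, text in options.items())
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def _bag_of_words(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    query_vec = _bag_of_words(query)
    ranked = sorted(
        passages,
        key=lambda p: _cosine(query_vec, _bag_of_words(p)),
        reverse=True,
    )
    return ranked[:k]


def accuracy(predicted, answer_key):
    """Fraction of predicted letters that match the answer key."""
    if len(predicted) != len(answer_key):
        raise ValueError("predicted and answer_key must have the same length")
    correct = sum(p == a for p, a in zip(predicted, answer_key))
    return correct / len(answer_key)
```

In this sketch, comparing the three roles would mean calling `build_prompt` once per entry in `ROLES` for each question, sending each prompt to the model, and passing the collected letters to `accuracy`. The RAG condition differs only in supplying `context=retrieve(question, passages)`.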

Source journal metrics
CiteScore: 3.10
Self-citation rate: 11.10%
Articles published: 578
Review time: 3.5 months
About the journal: JPRAS An International Journal of Surgical Reconstruction is one of the world's leading international journals, covering all the reconstructive and aesthetic aspects of plastic surgery. The journal presents the latest surgical procedures with audit and outcome studies of new and established techniques in plastic surgery, including cleft lip and palate and other head and neck surgery, hand surgery, lower limb trauma, burns, skin cancer, breast surgery, and aesthetic surgery.