Evaluating the Accuracy of ChatGPT in Common Patient Questions Regarding HPV+ Oropharyngeal Carcinoma.

Annals of Otology Rhinology and Laryngology (IF 1.2; Q3, Otorhinolaryngology; Zone 4, Medicine) Pub Date: 2024-09-01 Epub Date: 2024-07-29 DOI: 10.1177/00034894241259137

Nikhil Bellamkonda, Janice L Farlow, Catherine T Haring, Michael W Sim, Nolan B Seim, Richard B Cannon, Marcus M Monroe, Amit Agrawal, James W Rocco, Hilary C McCrary

Annals of Otology Rhinology and Laryngology, pages 814-819. Publication type: Journal Article. https://doi.org/10.1177/00034894241259137

Citations: 0

Abstract

Objectives: Large language model (LLM)-based chatbots such as ChatGPT have been publicly available and increasingly used by the general public since late 2022. This study sought to investigate ChatGPT responses to common patient questions regarding human papillomavirus (HPV)-positive oropharyngeal cancer (OPC).

Methods: This was a prospective, multi-institutional study, with data collected from high-volume institutions that perform >50 transoral robotic surgery cases per year. The 100 most recent discussion threads including the term "HPV" on the Head and Neck Cancer public discussion board of the American Cancer Society's Cancer Survivors Network were reviewed. The 11 most common questions were submitted serially to ChatGPT 3.5, and the answers were recorded. A survey was distributed to fellowship-trained head and neck oncologic surgeons at 3 institutions to evaluate the responses.

Results: A total of 8 surgeons participated in the study. For questions regarding HPV contraction and transmission, ChatGPT answers were scored as clinically accurate and aligned with consensus in the head and neck surgical oncology community 84.4% and 90.6% of the time, respectively. For questions involving treatment of HPV+ OPC, ChatGPT was clinically accurate and aligned with consensus 87.5% and 91.7% of the time, respectively. For questions regarding the HPV vaccine, ChatGPT was clinically accurate and aligned with consensus 62.5% and 75% of the time, respectively. When asked about circulating tumor DNA testing, only 12.5% of surgeons thought responses were accurate or consistent with consensus.

Conclusion: ChatGPT 3.5 performed poorly with questions involving evolving therapies and diagnostics; thus, caution should be used when using a platform like ChatGPT 3.5 to assess use of advanced technology. Patients should be counseled on the importance of consulting their surgeons to receive accurate and up-to-date recommendations, and to use LLMs to augment their understanding of these important health-related topics.
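
The percentages in the Results are shares of surgeon ratings. As a minimal sketch of that arithmetic, assuming each surgeon gives a yes/no rating per response (the abstract does not describe the survey scale, and the function name and sample ratings below are hypothetical), the circulating tumor DNA figure corresponds to 1 of 8 surgeons:

```python
from typing import Sequence


def percent_agreement(ratings: Sequence[bool]) -> float:
    """Return the share of True ratings as a percentage, rounded to one decimal."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return round(100 * sum(ratings) / len(ratings), 1)


# Hypothetical ratings: 8 surgeons, one of whom marks the response as accurate.
# Only the totals (8 surgeons, 12.5%) come from the abstract; the individual
# ratings are illustrative.
ctdna_ratings = [True] + [False] * 7

print(percent_agreement(ctdna_ratings))  # 12.5
```

Category-level figures would presumably pool ratings across several questions in the same way, but the abstract does not give the number of questions per category, so those counts are not reconstructed here.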

Source journal

Annals of Otology Rhinology and Laryngology (Medicine: Otorhinolaryngology)

CiteScore: 3.10
Self-citation rate: 7.10%
Articles published: 171
Review time: 4-8 weeks

Journal description: The Annals of Otology, Rhinology & Laryngology publishes original manuscripts of clinical and research importance in otolaryngology–head and neck medicine and surgery, otology, neurotology, bronchoesophagology, laryngology, rhinology, head and neck oncology and surgery, plastic and reconstructive surgery, pediatric otolaryngology, audiology, and speech pathology. In-depth studies (supplements), papers of historical interest, and reviews of computer software and applications in otolaryngology are also published, as well as imaging, pathology, and clinicopathology studies, book reviews, and letters to the editor. AOR is the official journal of the American Broncho-Esophagological Association.