Assessing accuracy of ChatGPT in response to questions from day to day pharmaceutical care in hospitals

Exploratory Research in Clinical and Social Pharmacy (IF 1.8, Q3 in Pharmacology & Pharmacy). Pub Date: 2024-06-13. DOI: 10.1016/j.rcsop.2024.100464
Merel van Nuland, Anne-Fleur H. Lobbezoo, Ewoudt M.W. van de Garde, Maikel Herbrink, Inger van Heijl, Tim Bognàr, Jeroen P.A. Houwen, Marloes Dekens, Demi Wannet, Toine Egberts, Paul D. van der Linden
{"title":"Assessing accuracy of ChatGPT in response to questions from day to day pharmaceutical care in hospitals","authors":"Merel van Nuland ,&nbsp;Anne-Fleur H. Lobbezoo ,&nbsp;Ewoudt M.W. van de Garde ,&nbsp;Maikel Herbrink ,&nbsp;Inger van Heijl ,&nbsp;Tim Bognàr ,&nbsp;Jeroen P.A. Houwen ,&nbsp;Marloes Dekens ,&nbsp;Demi Wannet ,&nbsp;Toine Egberts ,&nbsp;Paul D. van der Linden","doi":"10.1016/j.rcsop.2024.100464","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><p>The advent of Large Language Models (LLMs) such as ChatGPT introduces opportunities within the medical field. Nonetheless, use of LLM poses a risk when healthcare practitioners and patients present clinical questions to these programs without a comprehensive understanding of its suitability for clinical contexts.</p></div><div><h3>Objective</h3><p>The objective of this study was to assess ChatGPT's ability to generate appropriate responses to clinical questions that hospital pharmacists could encounter during routine patient care.</p></div><div><h3>Methods</h3><p>Thirty questions from 10 different domains within clinical pharmacy were collected during routine care. Questions were presented to ChatGPT in a standardized format, including patients' age, sex, drug name, dose, and indication. Subsequently, relevant information regarding specific cases were provided, and the prompt was concluded with the query “what would a hospital pharmacist do?”. The impact on accuracy was assessed for each domain by modifying personification to “what would you do?”, presenting the question in Dutch, and regenerating the primary question. All responses were independently evaluated by two senior hospital pharmacists, focusing on the availability of an advice, accuracy and concordance.</p></div><div><h3>Results</h3><p>In 77% of questions, ChatGPT provided an advice in response to the question. For these responses, accuracy and concordance were determined. Accuracy was correct and complete for 26% of responses, correct but incomplete for 22% of responses, partially correct and partially incorrect for 30% of responses and completely incorrect for 22% of responses. The reproducibility was poor, with merely 10% of responses remaining consistent upon regeneration of the primary question.</p></div><div><h3>Conclusions</h3><p>While concordance of responses was excellent, the accuracy and reproducibility were poor. With the described method, ChatGPT should not be used to address questions encountered by hospital pharmacists during their shifts. 
However, it is important to acknowledge the limitations of our methodology, including potential biases, which may have influenced the findings.</p></div>","PeriodicalId":73003,"journal":{"name":"Exploratory research in clinical and social pharmacy","volume":null,"pages":null},"PeriodicalIF":1.8000,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667276624000611/pdfft?md5=7dba765dfd1e9f2fac71ba4ccdc63981&pid=1-s2.0-S2667276624000611-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Exploratory research in clinical and social pharmacy","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2667276624000611","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"PHARMACOLOGY & PHARMACY","Score":null,"Total":0}
Citations: 0

Abstract

Background

The advent of Large Language Models (LLMs) such as ChatGPT introduces opportunities within the medical field. Nonetheless, the use of LLMs poses a risk when healthcare practitioners and patients present clinical questions to these programs without a comprehensive understanding of their suitability for clinical contexts.

Objective

The objective of this study was to assess ChatGPT's ability to generate appropriate responses to clinical questions that hospital pharmacists could encounter during routine patient care.

Methods

Thirty questions from 10 different domains within clinical pharmacy were collected during routine care. Questions were presented to ChatGPT in a standardized format, including the patient's age, sex, drug name, dose, and indication. Subsequently, relevant information regarding the specific case was provided, and the prompt was concluded with the query "what would a hospital pharmacist do?". The impact on accuracy was assessed for each domain by modifying the personification to "what would you do?", presenting the question in Dutch, and regenerating the primary question. All responses were independently evaluated by two senior hospital pharmacists, focusing on the availability of advice, accuracy, and concordance.
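The study queried the ChatGPT interface directly, so the sketch below is an illustration only: it shows how such a standardized prompt could be assembled and submitted programmatically through the OpenAI chat-completions API. The model name, prompt wording, and the example case are assumptions, not details taken from the paper.

```python
# Illustrative sketch only: the study used the ChatGPT interface, so the
# exact prompt wording and API parameters here are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def build_prompt(age: int, sex: str, drug: str, dose: str,
                 indication: str, case_details: str) -> str:
    """Assemble a question in the standardized format described above."""
    return (
        f"Patient: {age}-year-old {sex}. "
        f"Drug: {drug}, dose: {dose}, indication: {indication}. "
        f"{case_details} "
        "What would a hospital pharmacist do?"
    )


# Hypothetical example case, not taken from the study.
prompt = build_prompt(
    age=67, sex="male", drug="vancomycin", dose="1000 mg IV q12h",
    indication="MRSA bacteremia",
    case_details="Trough concentration is 25 mg/L after three doses.",
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # assumed model; the paper names only "ChatGPT"
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```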

Results

ChatGPT provided advice in response to 77% of the questions. For these responses, accuracy and concordance were determined. Responses were correct and complete for 26%, correct but incomplete for 22%, partially correct and partially incorrect for 30%, and completely incorrect for 22%. Reproducibility was poor, with only 10% of responses remaining consistent upon regeneration of the primary question.
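The abstract does not state which statistic the two independent raters' concordance was measured with; Cohen's kappa is one common choice for quantifying agreement between two evaluators, as in this minimal sketch. The category labels and ratings below are hypothetical.

```python
# Illustrative only: the abstract does not name the agreement statistic;
# Cohen's kappa is a common measure for two independent raters.
from sklearn.metrics import cohen_kappa_score

# Hypothetical accuracy categories assigned by the two pharmacists:
# 0 = correct and complete, 1 = correct but incomplete,
# 2 = partially correct and partially incorrect, 3 = completely incorrect.
rater_a = [0, 1, 2, 3, 0, 2, 1, 3, 2, 0]
rater_b = [0, 1, 2, 2, 0, 2, 1, 3, 2, 1]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement, 0 = chance
```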

Conclusions

While concordance of responses was excellent, the accuracy and reproducibility were poor. With the described method, ChatGPT should not be used to address questions encountered by hospital pharmacists during their shifts. However, it is important to acknowledge the limitations of our methodology, including potential biases, which may have influenced the findings.
