The role of large language models in self-care: a study and benchmark on medicines and supplement guidance accuracy.

IF 2.6 | Zone 4 (Medicine) | Q2 PHARMACOLOGY & PHARMACY | International Journal of Clinical Pharmacy | Pub Date: 2024-12-07 | DOI: 10.1007/s11096-024-01839-2
Branco De Busser, Lynn Roth, Hans De Loof
Citations: 0

Abstract

Background: The recent surge in the capabilities of artificial intelligence systems, particularly large language models, is also impacting the medical and pharmaceutical field in a major way. Beyond specialized uses in diagnostics and data discovery, these tools have now become accessible to the general public.

Aim: The study aimed to critically analyse the current performance of large language models in answering patients' self-care questions regarding medications and supplements.

Method: Answers from six major language models were analysed for correctness, language-independence, context-sensitivity, and reproducibility using a newly developed reference set of questions and a scoring matrix.

Results: The investigated large language models are capable of answering a clear majority of self-care questions accurately, providing relevant health information. However, substantial variability in the responses, including potentially unsafe advice, was observed, influenced by language, question structure, user context and time. GPT 4.0 scored highest on average, while GPT 3.5, Gemini, and Gemini Advanced had varied scores. Responses were sensitive to both context and language. In terms of consistency over time, Perplexity had the worst performance.

Conclusion: Given the high-quality output of large language models, their potential in self-care applications is undeniable. The newly created benchmark can facilitate further validation and guide the establishment of strict safeguards to combat the sizable risk of misinformation in order to reach a more favourable risk/benefit ratio when this cutting-edge technology is used by patients.
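
The abstract does not spell out how the scoring matrix is aggregated, so the following is only a minimal sketch of one way graded answers could be summarized per model: a mean correctness score, plus the spread across repeated runs as a rough proxy for reproducibility. The record layout, the 0-to-1 score scale, the model names and all values are hypothetical and invented for illustration; none of them are taken from the study.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical graded answers: (model, question_id, run, score in [0, 1]).
# Invented for illustration only; NOT results from the study.
records = [
    ("model_a", "q1", 1, 1.0),
    ("model_a", "q1", 2, 1.0),
    ("model_a", "q2", 1, 0.5),
    ("model_a", "q2", 2, 1.0),
    ("model_b", "q1", 1, 1.0),
    ("model_b", "q1", 2, 0.0),
    ("model_b", "q2", 1, 0.5),
    ("model_b", "q2", 2, 0.5),
]


def summarize(records):
    """Return {model: (mean_score, mean_spread)}.

    mean_score averages the per-question mean score.
    mean_spread averages the per-question population standard deviation
    across repeated runs (0.0 means identical scores on every run).
    """
    # Group scores by (model, question) so repeated runs sit together.
    by_question = defaultdict(list)
    for model, question_id, _run, score in records:
        by_question[(model, question_id)].append(score)

    scores = defaultdict(list)
    spreads = defaultdict(list)
    for (model, _question_id), runs in by_question.items():
        scores[model].append(mean(runs))
        spreads[model].append(pstdev(runs))

    return {m: (mean(scores[m]), mean(spreads[m])) for m in scores}


if __name__ == "__main__":
    for model, (score, spread) in sorted(summarize(records).items()):
        print(f"{model}: mean score {score:.3f}, mean run-to-run spread {spread:.3f}")
```

A lower spread indicates more consistent answers over repeated runs. The same grouping could be keyed on prompt language or user context instead of run number to probe the other dimensions named in the Method.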

Source journal
CiteScore: 4.10
Self-citation rate: 8.30%
Articles published: 131
Review time: 4-8 weeks
Journal description: The International Journal of Clinical Pharmacy (IJCP) offers a platform for articles on research in Clinical Pharmacy, Pharmaceutical Care and related practice-oriented subjects in the pharmaceutical sciences. IJCP is a bi-monthly, international, peer-reviewed journal that publishes original research data, new ideas and discussions on pharmacotherapy and outcome research, clinical pharmacy, pharmacoepidemiology, pharmacoeconomics, the clinical use of medicines, medical devices and laboratory tests, information on medicines and medical devices, pharmacy services research, medication management, and other clinical aspects of pharmacy. IJCP publishes original Research articles, Review articles, Short research reports, Commentaries, Book reviews, and Letters to the Editor. International Journal of Clinical Pharmacy is affiliated with the European Society of Clinical Pharmacy (ESCP). ESCP promotes practice and research in Clinical Pharmacy, especially in Europe. The general aim of the society is to advance education, practice and research in Clinical Pharmacy. Until 2010 the journal was called Pharmacy World & Science.
Latest articles from this journal
Community pharmacists improving equitable access to contraceptive methods: a commentary.
Bacillus coagulans TBC169 probiotics for intestinal function recovery after gynecological open surgery: a randomized, double-blind, placebo-controlled trial.
Patient and hospital staff perspectives on introducing pharmacist-led medication reviews at an orthopedic ward: a mixed methods pilot study.
Medication-induced causes of delirium in patients with and without dementia: a systematic review of published neurology guidelines.
Defining polypharmacy in older adults: a cross-sectional comparison of prevalence estimates calculated according to active ingredient and unique product counts.