Large Language Model (LLM)-Powered Chatbots Fail to Generate Guideline-Consistent Content on Resuscitation and May Provide Potentially Harmful Advice.

IF 2.1 · CAS Zone 4 (Medicine) · JCR Q2 (Emergency Medicine) · Prehospital and Disaster Medicine · Pub Date: 2023-12-01 · Epub Date: 2023-11-06 · DOI: 10.1017/S1049023X23006568
Alexei A Birkun, Adhish Gautam
{"title":"Large Language Model (LLM)-Powered Chatbots Fail to Generate Guideline-Consistent Content on Resuscitation and May Provide Potentially Harmful Advice.","authors":"Alexei A Birkun, Adhish Gautam","doi":"10.1017/S1049023X23006568","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>Innovative large language model (LLM)-powered chatbots, which are extremely popular nowadays, represent potential sources of information on resuscitation for the general public. For instance, the chatbot-generated advice could be used for purposes of community resuscitation education or for just-in-time informational support of untrained lay rescuers in a real-life emergency.</p><p><strong>Study objective: </strong>This study focused on assessing performance of two prominent LLM-based chatbots, particularly in terms of quality of the chatbot-generated advice on how to give help to a non-breathing victim.</p><p><strong>Methods: </strong>In May 2023, the new Bing (Microsoft Corporation, USA) and Bard (Google LLC, USA) chatbots were inquired (<i>n</i> = 20 each): \"What to do if someone is not breathing?\" Content of the chatbots' responses was evaluated for compliance with the 2021 Resuscitation Council United Kingdom guidelines using a pre-developed checklist.</p><p><strong>Results: </strong>Both chatbots provided context-dependent textual responses to the query. However, coverage of the guideline-consistent instructions on help to a non-breathing victim within the responses was poor: mean percentage of the responses completely satisfying the checklist criteria was 9.5% for Bing and 11.4% for Bard (<i>P</i> >.05). Essential elements of the bystander action, including early start and uninterrupted performance of chest compressions with adequate depth, rate, and chest recoil, as well as request for and use of an automated external defibrillator (AED), were missing as a rule. Moreover, 55.0% of Bard's responses contained plausible sounding, but nonsensical guidance, called artificial hallucinations, that create risk for inadequate care and harm to a victim.</p><p><strong>Conclusion: </strong>The LLM-powered chatbots' advice on help to a non-breathing victim omits essential details of resuscitation technique and occasionally contains deceptive, potentially harmful directives. Further research and regulatory measures are required to mitigate risks related to the chatbot-generated misinformation of public on resuscitation.</p>","PeriodicalId":20400,"journal":{"name":"Prehospital and Disaster Medicine","volume":null,"pages":null},"PeriodicalIF":2.1000,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Prehospital and Disaster Medicine","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1017/S1049023X23006568","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/11/6 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"EMERGENCY MEDICINE","Score":null,"Total":0}
Citations: 0

Abstract

Introduction: Large language model (LLM)-powered chatbots, now in widespread use, are potential sources of resuscitation information for the general public. Chatbot-generated advice could be used, for example, in community resuscitation education or for just-in-time informational support of untrained lay rescuers in a real-life emergency.

Study objective: This study assessed the performance of two prominent LLM-based chatbots, focusing on the quality of their advice on how to help a non-breathing victim.

Methods: In May 2023, the new Bing (Microsoft Corporation, USA) and Bard (Google LLC, USA) chatbots were each queried 20 times (n = 20 per chatbot) with the question: "What to do if someone is not breathing?" The content of each response was evaluated for compliance with the 2021 Resuscitation Council UK guidelines using a pre-developed checklist.
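The procedure is straightforward to outline: pose the same question to each chatbot 20 times and score every response against a guideline-derived checklist. The sketch below is a minimal, hypothetical approximation in Python; the study itself used the chatbots' web interfaces with manual scoring, so query_chatbot() and the keyword-based checklist are illustrative assumptions, not the authors' actual instrument.

```python
# Minimal sketch of the query-and-score workflow. query_chatbot() is a
# hypothetical stand-in (the study collected responses manually from the
# Bing and Bard web interfaces), and the keyword checklist below is
# illustrative only, loosely inspired by Resuscitation Council UK guidance.

N_QUERIES = 20
QUESTION = "What to do if someone is not breathing?"

# Each checklist criterion maps to keywords suggesting it was covered.
CHECKLIST = {
    "call_ems": ["call 999", "call 911", "emergency services"],
    "start_cpr_early": ["start cpr", "begin chest compressions"],
    "compression_depth": ["5 cm", "6 cm", "2 inches"],
    "compression_rate": ["100", "120"],
    "use_aed": ["aed", "defibrillator"],
}

def query_chatbot(question: str) -> str:
    """Hypothetical helper; in practice, responses were gathered by hand."""
    raise NotImplementedError

def score_response(text: str) -> float:
    """Return the percentage of checklist criteria covered by one response."""
    text = text.lower()
    hits = sum(
        any(keyword in text for keyword in keywords)
        for keywords in CHECKLIST.values()
    )
    return 100.0 * hits / len(CHECKLIST)

def evaluate(n: int = N_QUERIES) -> list[float]:
    """Collect n responses and score each one against the checklist."""
    return [score_response(query_chatbot(QUESTION)) for _ in range(n)]
```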

Results: Both chatbots provided context-dependent textual responses to the query. However, coverage of guideline-consistent instructions for helping a non-breathing victim was poor: the mean percentage of responses completely satisfying the checklist criteria was 9.5% for Bing and 11.4% for Bard (P > .05). Essential elements of bystander action, including the early start and uninterrupted performance of chest compressions with adequate depth, rate, and chest recoil, as well as requesting and using an automated external defibrillator (AED), were missing as a rule. Moreover, 55.0% of Bard's responses contained plausible-sounding but nonsensical guidance, known as artificial hallucinations, which creates a risk of inadequate care and harm to a victim.
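The abstract reports mean checklist-satisfaction percentages of 9.5% vs 11.4% with P > .05 but does not name the statistical test used. As one plausible reconstruction, the snippet below compares per-response scores with a Mann-Whitney U test from scipy; the score lists are placeholders, not the study's data.

```python
# One plausible way to compare per-response checklist scores between the two
# chatbots. The abstract reports only P > .05 and does not name the test, so
# the Mann-Whitney U test is an assumption, and the scores are placeholders.
from scipy.stats import mannwhitneyu

bing_scores = [0.0, 10.0, 20.0, 0.0, 10.0]   # placeholder per-response % scores
bard_scores = [10.0, 0.0, 20.0, 10.0, 20.0]  # placeholder per-response % scores

stat, p_value = mannwhitneyu(bing_scores, bard_scores, alternative="two-sided")
print(f"Mean Bing: {sum(bing_scores) / len(bing_scores):.1f}%")
print(f"Mean Bard: {sum(bard_scores) / len(bard_scores):.1f}%")
print(f"Mann-Whitney U = {stat:.1f}, P = {p_value:.3f}")
```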

Conclusion: The LLM-powered chatbots' advice on helping a non-breathing victim omits essential details of resuscitation technique and occasionally contains deceptive, potentially harmful directives. Further research and regulatory measures are required to mitigate the risks of chatbot-generated misinformation on resuscitation reaching the public.

Source Journal

Prehospital and Disaster Medicine (Medicine: Emergency Medicine)
CiteScore: 3.10 · Self-citation rate: 13.60% · Articles published per year: 279
Journal description: Prehospital and Disaster Medicine (PDM) is an official publication of the World Association for Disaster and Emergency Medicine. Currently in its 25th volume, Prehospital and Disaster Medicine is one of the leading scientific journals focusing on prehospital and disaster health. It is the only peer-reviewed international journal in its field, published bi-monthly, providing a readable, usable worldwide source of research and analysis. PDM is currently distributed in more than 55 countries. Its readership includes physicians, professors, EMTs and paramedics, nurses, emergency managers, disaster planners, hospital administrators, sociologists, and psychologists.
Latest articles in this journal

Prehospital Care Under Fire: Strategies for Evacuating Victims from the Mega Terrorist Attack in Israel on October 7, 2023.
Challenges and Clinical Impact of Medical Search and Rescue Efforts Following the Kahramanmaraş Earthquake.
Integrating Disaster and Dignitary Medicine Principles into a Medical Framework for Organizational Travel Health and Security Planning.
Applications and Performance of Machine Learning Algorithms in Emergency Medical Services: A Scoping Review.
Rapid Ultrasonography for Shock and Hypotension Protocol Performed using Handheld Ultrasound Devices by Paramedics in a Moving Ambulance: Evaluation of Image Accuracy and Time in Motion.