Chatbots as Patient Education Resources for Aesthetic Facial Plastic Surgery: Evaluation of ChatGPT and Google Bard Responses.

Impact Factor: 1.6; Zone 3 (Medicine); JCR Q2 (Surgery); Journal: Facial Plastic Surgery & Aesthetic Medicine; Pub Date: 2024-11-01; Epub Date: 2024-07-01; DOI: 10.1089/fpsam.2023.0368
Neha Garg, Daniel J Campbell, Angela Yang, Adam McCann, Annie E Moroco, Leonard E Estephan, William J Palmer, Howard Krein, Ryan Heffelfinger
Pages: 665-673; Publication type: Journal Article; Open access: No; Source platform: Semanticscholar
Citations: 0

Abstract

Chatbots as Patient Education Resources for Aesthetic Facial Plastic Surgery: Evaluation of ChatGPT and Google Bard Responses.

Background: ChatGPT and Google Bard™ are popular artificial intelligence chatbots with utility for patients, including those undergoing aesthetic facial plastic surgery.

Objective: To compare the accuracy and readability of chatbot-generated responses to patient education questions regarding aesthetic facial plastic surgery using a response accuracy scale and readability testing.

Method: ChatGPT and Google Bard™ were asked 28 identical questions using four prompts: none, patient friendly, eighth-grade level, and references. Accuracy was assessed using Global Quality Scale (range: 1-5). Flesch-Kincaid grade level was calculated, and chatbot-provided references were analyzed for veracity.

Results: Although 59.8% of responses were good quality (Global Quality Scale ≥4), ChatGPT generated more accurate responses than Google Bard™ on patient-friendly prompting (p < 0.001). Google Bard™ responses were of a significantly lower grade level than ChatGPT for all prompts (p < 0.05). Despite eighth-grade prompting, response grade level for both chatbots was high: ChatGPT (10.5 ± 1.8) and Google Bard™ (9.6 ± 1.3). Prompting for references yielded 108/108 of chatbot-generated references. Forty-one (38.0%) citations were legitimate. Twenty (18.5%) provided accurately reported information from the reference.

Conclusion: Although ChatGPT produced more accurate responses and at a higher education level than Google Bard™, both chatbots provided responses above recommended grade levels for patients and failed to provide accurate references.
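The abstract reports readability as a Flesch-Kincaid grade level but does not say which tool was used to compute it. As a rough illustration only, the standard formula is 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below applies that formula in Python. The function names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and the vowel-group syllable counter is a simplifying assumption, so its scores will differ somewhat from dedicated readability software and should not be read as the authors' method.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting groups of consecutive vowels.

    This is a heuristic (an assumption of this sketch), not a dictionary lookup.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent (e.g. "grade"), except in "-le" endings
    # such as "table", where it usually forms its own syllable.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid grade level of an English text."""
    # Split on runs of sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # A word is a run of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


if __name__ == "__main__":
    sample = (
        "Rhinoplasty reshapes the nose. "
        "Most patients return to normal activities within a few weeks."
    )
    print(round(flesch_kincaid_grade(sample), 1))
```

A score near 8 would correspond to the eighth-grade target used in the study's prompts. The reported means of 10.5 and 9.6 sit above that target.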

Source journal metrics: CiteScore 2.70; self-citation rate 30.00%; articles published: 159