Artificial intelligence chatbots as sources of patient education material for cataract surgery: ChatGPT-4 versus Google Bard.

BMJ Open Ophthalmology (IF 2.0, Q2, Ophthalmology) · Pub Date: 2024-10-17 · DOI: 10.1136/bmjophth-2024-001824
Matthew Azzopardi, Benjamin Ng, Abison Logeswaran, Constantinos Loizou, Ryan Chin Taw Cheong, Prasanth Gireesh, Darren Shu Jeng Ting, Yu Jeat Chong
{"title":"人工智能聊天机器人作为白内障手术患者教育材料的来源:ChatGPT-4 与 Google Bard 的对比。","authors":"Matthew Azzopardi, Benjamin Ng, Abison Logeswaran, Constantinos Loizou, Ryan Chin Taw Cheong, Prasanth Gireesh, Darren Shu Jeng Ting, Yu Jeat Chong","doi":"10.1136/bmjophth-2024-001824","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>To conduct a head-to-head comparative analysis of cataract surgery patient education material generated by Chat Generative Pre-trained Transformer (ChatGPT-4) and Google Bard.</p><p><strong>Methods and analysis: </strong>98 frequently asked questions on cataract surgery in English were taken in November 2023 from 5 trustworthy online patient information resources. 59 of these were curated (20 augmented for clarity and 39 duplicates excluded) and categorised into 3 domains: condition (n=15), preparation for surgery (n=21) and recovery after surgery (n=23). They were formulated into input prompts with 'prompt engineering'. Using the Patient Education Materials Assessment Tool-Printable (PEMAT-P) Auto-Scoring Form, four ophthalmologists independently graded ChatGPT-4 and Google Bard responses. The readability of responses was evaluated using a Flesch-Kincaid calculator. Responses were also subjectively examined for any inaccurate or harmful information.</p><p><strong>Results: </strong>Google Bard had a higher mean overall Flesch-Kincaid Level (8.02) compared with ChatGPT-4 (5.75) (p<0.001), also noted across all three domains. ChatGPT-4 had a higher overall PEMAT-P understandability score (85.8%) in comparison to Google Bard (80.9%) (p<0.001), which was also noted in the 'preparation for cataract surgery' (85.2% vs 75.7%; p<0.001) and 'recovery after cataract surgery' (86.5% vs 82.3%; p=0.004) domains. There was no statistically significant difference in overall (42.5% vs 44.2%; p=0.344) or individual domain actionability scores (p>0.10). None of the generated material contained dangerous information.</p><p><strong>Conclusion: </strong>In comparison to Google Bard, ChatGPT-4 fared better overall, scoring higher on the PEMAT-P understandability scale and exhibiting more faithfulness to the prompt engineering instruction. Since input prompts might vary from real-world patient searches, follow-up studies with patient participation are required.</p>","PeriodicalId":9286,"journal":{"name":"BMJ Open Ophthalmology","volume":"9 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11487885/pdf/","citationCount":"0","resultStr":"{\"title\":\"Artificial intelligence chatbots as sources of patient education material for cataract surgery: ChatGPT-4 versus Google Bard.\",\"authors\":\"Matthew Azzopardi, Benjamin Ng, Abison Logeswaran, Constantinos Loizou, Ryan Chin Taw Cheong, Prasanth Gireesh, Darren Shu Jeng Ting, Yu Jeat Chong\",\"doi\":\"10.1136/bmjophth-2024-001824\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objective: </strong>To conduct a head-to-head comparative analysis of cataract surgery patient education material generated by Chat Generative Pre-trained Transformer (ChatGPT-4) and Google Bard.</p><p><strong>Methods and analysis: </strong>98 frequently asked questions on cataract surgery in English were taken in November 2023 from 5 trustworthy online patient information resources. 
59 of these were curated (20 augmented for clarity and 39 duplicates excluded) and categorised into 3 domains: condition (n=15), preparation for surgery (n=21) and recovery after surgery (n=23). They were formulated into input prompts with 'prompt engineering'. Using the Patient Education Materials Assessment Tool-Printable (PEMAT-P) Auto-Scoring Form, four ophthalmologists independently graded ChatGPT-4 and Google Bard responses. The readability of responses was evaluated using a Flesch-Kincaid calculator. Responses were also subjectively examined for any inaccurate or harmful information.</p><p><strong>Results: </strong>Google Bard had a higher mean overall Flesch-Kincaid Level (8.02) compared with ChatGPT-4 (5.75) (p<0.001), also noted across all three domains. ChatGPT-4 had a higher overall PEMAT-P understandability score (85.8%) in comparison to Google Bard (80.9%) (p<0.001), which was also noted in the 'preparation for cataract surgery' (85.2% vs 75.7%; p<0.001) and 'recovery after cataract surgery' (86.5% vs 82.3%; p=0.004) domains. There was no statistically significant difference in overall (42.5% vs 44.2%; p=0.344) or individual domain actionability scores (p>0.10). None of the generated material contained dangerous information.</p><p><strong>Conclusion: </strong>In comparison to Google Bard, ChatGPT-4 fared better overall, scoring higher on the PEMAT-P understandability scale and exhibiting more faithfulness to the prompt engineering instruction. Since input prompts might vary from real-world patient searches, follow-up studies with patient participation are required.</p>\",\"PeriodicalId\":9286,\"journal\":{\"name\":\"BMJ Open Ophthalmology\",\"volume\":\"9 1\",\"pages\":\"\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2024-10-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11487885/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BMJ Open Ophthalmology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1136/bmjophth-2024-001824\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"OPHTHALMOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMJ Open Ophthalmology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1136/bmjophth-2024-001824","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
Citations: 0

Abstract


Objective: To conduct a head-to-head comparative analysis of cataract surgery patient education material generated by Chat Generative Pre-trained Transformer (ChatGPT-4) and Google Bard.

Methods and analysis: 98 frequently asked questions on cataract surgery in English were taken in November 2023 from 5 trustworthy online patient information resources. 59 of these were curated (20 augmented for clarity and 39 duplicates excluded) and categorised into 3 domains: condition (n=15), preparation for surgery (n=21) and recovery after surgery (n=23). They were formulated into input prompts with 'prompt engineering'. Using the Patient Education Materials Assessment Tool-Printable (PEMAT-P) Auto-Scoring Form, four ophthalmologists independently graded ChatGPT-4 and Google Bard responses. The readability of responses was evaluated using a Flesch-Kincaid calculator. Responses were also subjectively examined for any inaccurate or harmful information.
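For reference, the Flesch-Kincaid Grade Level used to assess readability is a fixed formula over word, sentence and syllable counts. The sketch below is a minimal illustration of that formula, not the calculator used in the study; its vowel-group syllable heuristic is a simplifying assumption, as published calculators typically rely on dictionary-based syllable counts.

```python
import re

def count_syllables(word: str) -> int:
    # Crude vowel-group heuristic; treat the result as approximate.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # Standard Flesch-Kincaid Grade Level formula:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

# A grade level near 8 (as reported for Google Bard) corresponds roughly
# to text readable by an average eighth-grade student.
sample = "Cataract surgery replaces the cloudy lens in your eye with a clear artificial lens."
print(round(flesch_kincaid_grade(sample), 1))
```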

Results: Google Bard had a higher mean overall Flesch-Kincaid Level (8.02) compared with ChatGPT-4 (5.75) (p<0.001), also noted across all three domains. ChatGPT-4 had a higher overall PEMAT-P understandability score (85.8%) in comparison to Google Bard (80.9%) (p<0.001), which was also noted in the 'preparation for cataract surgery' (85.2% vs 75.7%; p<0.001) and 'recovery after cataract surgery' (86.5% vs 82.3%; p=0.004) domains. There was no statistically significant difference in overall (42.5% vs 44.2%; p=0.344) or individual domain actionability scores (p>0.10). None of the generated material contained dangerous information.
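The PEMAT-P percentages reported above follow the tool's standard scoring: each item is rated Agree (1) or Disagree (0), items judged not applicable are excluded, and the score is the share of points earned among applicable items. A minimal sketch of that aggregation is shown below; the item ratings are hypothetical placeholders, not data from this study.

```python
from typing import Optional

def pemat_score(ratings: list[Optional[int]]) -> float:
    """Percentage score: Agree = 1, Disagree = 0, None = not applicable (excluded)."""
    applicable = [r for r in ratings if r is not None]
    return 100.0 * sum(applicable) / len(applicable)

# Hypothetical understandability ratings for a single chatbot response
# (1 = Agree, 0 = Disagree, None = N/A); illustrative only.
understandability_items = [1, 1, 0, 1, None, 1, 1, 0, 1, 1]
print(f"{pemat_score(understandability_items):.1f}%")  # 77.8%
```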

Conclusion: In comparison to Google Bard, ChatGPT-4 fared better overall, scoring higher on the PEMAT-P understandability scale and exhibiting more faithfulness to the prompt engineering instruction. Since input prompts might vary from real-world patient searches, follow-up studies with patient participation are required.

Source journal: BMJ Open Ophthalmology (Ophthalmology)
CiteScore: 3.40
Self-citation rate: 4.20%
Articles published: 104
Review turnaround: 20 weeks