ChatGPT as a patient education tool in colorectal cancer—An in-depth assessment of efficacy, quality and readability

Colorectal Disease | IF 2.9 | JCR Q2 (Gastroenterology & Hepatology) | CAS Tier 3 (Medicine) | Pub Date: 2024-12-17 | DOI: 10.1111/codi.17267
Adrian H. Y. Siu, Damien P. Gibson, Chris Chiu, Allan Kwok, Matt Irwin, Adam Christie, Cherry E. Koh, Anil Keshava, Mifanwy Reece, Michael Suen, Matthew J. F. X. Rickard
{"title":"ChatGPT as a patient education tool in colorectal cancer—An in-depth assessment of efficacy, quality and readability","authors":"Adrian H. Y. Siu,&nbsp;Damien P. Gibson,&nbsp;Chris Chiu,&nbsp;Allan Kwok,&nbsp;Matt Irwin,&nbsp;Adam Christie,&nbsp;Cherry E. Koh,&nbsp;Anil Keshava,&nbsp;Mifanwy Reece,&nbsp;Michael Suen,&nbsp;Matthew J. F. X. Rickard","doi":"10.1111/codi.17267","DOIUrl":null,"url":null,"abstract":"<div>\n \n \n <section>\n \n <h3> Aim</h3>\n \n <p>Artificial intelligence (AI) chatbots such as Chat Generative Pretrained Transformer-4 (ChatGPT-4) have made significant strides in generating human-like responses. Trained on an extensive corpus of medical literature, ChatGPT-4 has the potential to augment patient education materials. These chatbots may be beneficial to populations considering a diagnosis of colorectal cancer (CRC). However, the accuracy and quality of patient education materials are crucial for informed decision-making. Given workforce demands impacting holistic care, AI chatbots can bridge gaps in CRC information, reaching wider demographics and crossing language barriers. However, rigorous evaluation is essential to ensure accuracy, quality and readability. Therefore, this study aims to evaluate the efficacy, quality and readability of answers generated by ChatGPT-4 on CRC, utilizing patient-style question prompts.</p>\n </section>\n \n <section>\n \n <h3> Method</h3>\n \n <p>To evaluate ChatGPT-4, eight CRC-related questions were derived using peer-reviewed literature and Google Trends. Eight colorectal surgeons evaluated AI responses for accuracy, safety, appropriateness, actionability and effectiveness. Quality was assessed using validated tools: the Patient Education Materials Assessment Tool (PEMAT-AI), modified DISCERN (DISCERN-AI) and Global Quality Score (GQS). A number of readability assessments were measured including Flesch Reading Ease (FRE) and the Gunning Fog Index (GFI).</p>\n </section>\n \n <section>\n \n <h3> Results</h3>\n \n <p>The responses were generally accurate (median 4.00), safe (4.25), appropriate (4.00), actionable (4.00) and effective (4.00). Quality assessments rated PEMAT-AI as ‘very good’ (71.43), DISCERN-AI as ‘fair’ (12.00) and GQS as ‘high’ (4.00). Readability scores indicated difficulty (FRE 47.00, GFI 12.40), suggesting a higher educational level was required.</p>\n </section>\n \n <section>\n \n <h3> Conclusion</h3>\n \n <p>This study concludes that ChatGPT-4 is capable of providing safe but nonspecific medical information, suggesting its potential as a patient education aid. However, enhancements in readability through contextual prompting and fine-tuning techniques are required before considering implementation into clinical practice.</p>\n </section>\n </div>","PeriodicalId":10512,"journal":{"name":"Colorectal Disease","volume":"27 1","pages":""},"PeriodicalIF":2.9000,"publicationDate":"2024-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Colorectal Disease","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/codi.17267","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"GASTROENTEROLOGY & HEPATOLOGY","Score":null,"Total":0}
Citations: 0

Abstract

Aim

Artificial intelligence (AI) chatbots such as Chat Generative Pretrained Transformer-4 (ChatGPT-4) have made significant strides in generating human-like responses. Trained on an extensive corpus of medical literature, ChatGPT-4 has the potential to augment patient education materials. These chatbots may be beneficial to populations considering a diagnosis of colorectal cancer (CRC). However, the accuracy and quality of patient education materials are crucial for informed decision-making. Given workforce demands impacting holistic care, AI chatbots can bridge gaps in CRC information, reaching wider demographics and crossing language barriers. However, rigorous evaluation is essential to ensure accuracy, quality and readability. Therefore, this study aims to evaluate the efficacy, quality and readability of answers generated by ChatGPT-4 on CRC, utilizing patient-style question prompts.

Method

To evaluate ChatGPT-4, eight CRC-related questions were derived from peer-reviewed literature and Google Trends. Eight colorectal surgeons evaluated the AI responses for accuracy, safety, appropriateness, actionability and effectiveness. Quality was assessed using validated tools: the Patient Education Materials Assessment Tool (PEMAT-AI), modified DISCERN (DISCERN-AI) and the Global Quality Score (GQS). Readability was measured with several indices, including the Flesch Reading Ease (FRE) score and the Gunning Fog Index (GFI).
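The abstract does not state which software was used to compute the readability indices, but both follow standard published formulas. The sketch below is a minimal, illustrative Python implementation of the FRE and GFI calculations; the syllable counter is a crude vowel-group heuristic assumed here for self-containment, and dedicated packages such as textstat give more reliable counts.

import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count vowel groups, subtract a trailing silent 'e'.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def readability(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)

    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Flesch Reading Ease: higher scores mean easier text (30-50 is 'difficult').
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Gunning Fog Index: estimates the years of schooling needed to follow the text.
    gfi = 0.4 * (words_per_sentence + 100 * complex_words / len(words))
    return {"FRE": round(fre, 2), "GFI": round(gfi, 2)}

print(readability("Colorectal cancer is usually treated with surgery. "
                  "Your surgeon will explain the operation and recovery."))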

Results

The responses were generally accurate (median 4.00), safe (4.25), appropriate (4.00), actionable (4.00) and effective (4.00). Quality assessments rated PEMAT-AI as ‘very good’ (71.43), DISCERN-AI as ‘fair’ (12.00) and GQS as ‘high’ (4.00). Readability scores indicated difficulty (FRE 47.00, GFI 12.40), suggesting a higher educational level was required.
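As a point of reference, these scores can be read against the conventional interpretation bands for the two indices (the band labels below are the standard Flesch categories, not thresholds defined by this study):

def interpret(fre: float, gfi: float) -> str:
    # Conventional Flesch Reading Ease bands (higher = easier to read).
    if fre >= 60:
        band = "plain English (about 8th-9th grade or easier)"
    elif fre >= 50:
        band = "fairly difficult (10th-12th grade)"
    elif fre >= 30:
        band = "difficult (college level)"
    else:
        band = "very difficult (college-graduate level)"
    # The Gunning Fog Index approximates years of formal schooling required.
    return (f"FRE {fre:.2f}: {band}; "
            f"GFI {gfi:.2f}: about {gfi:.0f} years of education")

print(interpret(47.00, 12.40))
# FRE 47.00: difficult (college level); GFI 12.40: about 12 years of education

On this reading, an FRE of 47 and a GFI of 12.4 both place the responses at roughly college-entry reading level, above the sixth-to-eighth-grade level often recommended for patient education materials.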

Conclusion

This study concludes that ChatGPT-4 is capable of providing safe but nonspecific medical information, suggesting its potential as a patient education aid. However, enhancements in readability through contextual prompting and fine-tuning techniques are required before considering implementation into clinical practice.

Source journal
Colorectal Disease (Medicine - Gastroenterology & Hepatology)
CiteScore: 6.10
Self-citation rate: 11.80%
Articles published: 406
Average review time: 1.5 months
Journal introduction: Diseases of the colon and rectum are common and offer a number of exciting challenges. Clinical, diagnostic and basic science research is expanding rapidly. There is increasing demand from purchasers of health care and patients for clinicians to keep abreast of the latest research and developments, and to translate these into routine practice. Technological advances in diagnosis, surgical technique, new pharmaceuticals, molecular genetics and other basic sciences have transformed many aspects of how these diseases are managed. Such progress will accelerate. Colorectal Disease offers a real benefit to subscribers and authors. It is first and foremost a vehicle for publishing original research relating to the demanding, rapidly expanding field of colorectal diseases. Essential for surgeons, pathologists, oncologists, gastroenterologists and health professionals caring for patients with a disease of the lower GI tract, Colorectal Disease furthers education and inter-professional development by including regular review articles and discussions of current controversies. Note that the journal does not usually accept paediatric surgical papers.