Unlocking the future of patient education: ChatGPT vs. LexiComp® as sources of patient education materials

Elizabeth W. Covington, Courtney S. Watts Alexander, Jeanna Sewell, Amber M. Hutchison, Julie Kay, Lucy Tocco, Melanie Hyte

Journal of the American Pharmacists Association, Volume 65, Issue 1, Article 102119, January 2025. DOI: 10.1016/j.japh.2024.102119
Abstract
Background
ChatGPT is a conversational artificial intelligence technology that has shown application in various facets of healthcare. With the increased use of AI, it is imperative to assess the accuracy and comprehensibility of AI platforms.
Objective
This pilot project aimed to assess the understandability, readability, and accuracy of ChatGPT as a source of medication-related patient education as compared with an evidence-based medicine tertiary reference resource, LexiComp®.
Methods
Patient education materials (PEMs) were obtained from ChatGPT and LexiComp® for 8 common medications (albuterol, apixaban, atorvastatin, hydrocodone/acetaminophen, insulin glargine, levofloxacin, omeprazole, and sacubitril/valsartan). PEMs were extracted, blinded, and assessed independently by 2 investigators. The primary outcome was a comparison of Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P) scores. Secondary outcomes included Flesch reading ease, Flesch-Kincaid grade level, percent passive sentences, word count, and accuracy. A 7-item accuracy checklist for each medication was generated by expert consensus among pharmacist investigators, with LexiComp® PEMs serving as the control. PEMAT-P interrater reliability was determined via intraclass correlation coefficient (ICC). Flesch reading ease, Flesch-Kincaid grade level, percent passive sentences, and word count were calculated by Microsoft® Word®. Continuous data were assessed using the Student’s t-test via SPSS (version 20.0).
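The two readability metrics named above follow standard published formulas. The study computed them with Microsoft Word; as an illustration only, here is a minimal Python sketch using a rough vowel-group syllable heuristic (a simplifying assumption — real tools use dictionary-based syllable counts, so values will differ somewhat):

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, treating a lone trailing 'e' as silent."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1 and not word.endswith(("le", "ee")):
        n -= 1
    return max(n, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch reading ease, Flesch-Kincaid grade level) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # average words per sentence
    spw = syllables / len(words)        # average syllables per word
    ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade = 0.39 * wps + 11.8 * spw - 15.59
    return ease, grade
```

Short, monosyllabic sentences score high on reading ease and low (even negative) on grade level, which is why PEM guidance targets short sentences and plain words.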
Results
No difference was found in the PEMAT-P understandability score of PEMs produced by ChatGPT versus LexiComp® [77.9% (11.0) vs. 72.5% (2.4), P = 0.193]. Reading level was higher with ChatGPT [8.6 (1.2) vs. 5.6 (0.3), P < 0.001]. ChatGPT PEMs had a lower percentage of passive sentences and a lower word count. The average accuracy score of ChatGPT PEMs was 4.25/7 (61%), with scores ranging from 29% to 86%.
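The group comparisons above use the pooled-variance Student's t-test (run in SPSS per the methods). For illustration, a self-contained stdlib sketch of the test statistic — the sample lists in the test are placeholders, not the study's data:

```python
import math
import statistics

def students_t(a: list[float], b: list[float]) -> tuple[float, int]:
    """Pooled-variance (Student's) two-sample t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    # Pooled variance assumes the two groups share a common variance.
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2
```

The resulting t statistic is compared against the t distribution with the returned degrees of freedom to obtain the P values reported above.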
Conclusion
Despite comparable PEMAT-P scores, ChatGPT PEMs did not meet grade-level targets. The lower word count and smaller share of passive text in ChatGPT PEMs could benefit patients, but the variable accuracy scores preclude routine use of ChatGPT to produce medication-related PEMs at this time.
About the journal
The Journal of the American Pharmacists Association is the official peer-reviewed journal of the American Pharmacists Association (APhA), providing information on pharmaceutical care, drug therapy, diseases and other health issues, trends in pharmacy practice and therapeutics, informed opinion, and original research. JAPhA publishes original research, reviews, experiences, and opinion articles that link science to contemporary pharmacy practice to improve patient care.