Andres A Abreu, Gilbert Z Murimwa, Emile Farah, James W Stewart, Lucia Zhang, Jonathan Rodriguez, John Sweetenham, Herbert J Zeh, Sam C Wang, Patricio M Polanco
Journal of the National Comprehensive Cancer Network (Impact Factor 14.8; JCR Q1, Oncology). Published 2024-05-15. DOI: 10.6004/jnccn.2023.7334 (https://doi.org/10.6004/jnccn.2023.7334)
Citations: 0
Abstract
Enhancing Readability of Online Patient-Facing Content: The Role of AI Chatbots in Improving Cancer Information Accessibility.
Background: Internet-based health education is increasingly vital in patient care. However, the readability of online information often exceeds the average reading level of the US population, limiting accessibility and comprehension. This study investigates the use of chatbot artificial intelligence to improve the readability of cancer-related patient-facing content.
Methods: We used ChatGPT 4.0 to rewrite content about breast, colon, lung, prostate, and pancreas cancer across 34 websites associated with NCCN Member Institutions. Readability was analyzed using Fry Readability Score, Flesch-Kincaid Grade Level, Gunning Fog Index, and Simple Measure of Gobbledygook. The primary outcome was the mean readability score for the original and artificial intelligence (AI)-generated content. As secondary outcomes, we assessed the accuracy, similarity, and quality using F1 scores, cosine similarity scores, and section 2 of the DISCERN instrument, respectively.
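To make the readability measures named above concrete, here is a minimal sketch of three of them using their standard published formulas (Flesch-Kincaid Grade Level, Gunning Fog Index, and SMOG). The abstract does not say which software the authors used, so this is an illustration and not their pipeline. The vowel-group syllable counter, the function names, and the two sample passages are assumptions made for the example; the Fry score is graph-based and is left out.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of consecutive vowels, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability(text: str) -> dict:
    """Return grade-level estimates from three standard formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    n_sent, n_words = len(sentences), len(words)
    # Words of three or more syllables count as "complex" / polysyllabic.
    polysyllables = sum(1 for s in syllables if s >= 3)
    return {
        "flesch_kincaid_grade": 0.39 * n_words / n_sent
        + 11.8 * sum(syllables) / n_words
        - 15.59,
        "gunning_fog": 0.4 * (n_words / n_sent + 100 * polysyllables / n_words),
        "smog": 1.0430 * math.sqrt(polysyllables * 30 / n_sent) + 3.1291,
    }


# Hypothetical passages, written for this example only.
original = (
    "Pancreatic adenocarcinoma necessitates multidisciplinary evaluation. "
    "Therapeutic interventions frequently incorporate chemotherapy."
)
simple = (
    "Doctors work as a team to check the cancer. "
    "They often use drugs to treat it."
)

for label, passage in (("original", original), ("simple", simple)):
    print(label, {k: round(v, 1) for k, v in readability(passage).items()})
```

All three formulas reward shorter sentences and fewer long words, so the simpler passage scores a lower grade level. The syllable heuristic is crude, so absolute values from this sketch should not be compared with the grades reported in the study.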
Results: The mean readability level across the 34 websites was equivalent to a university freshman level (grade 13±1.5). However, after ChatGPT's intervention, the AI-generated outputs had a mean readability score equivalent to a high school freshman education level (grade 9±0.8). The overall F1 score for the rewritten content was 0.87, the precision score was 0.934, and the recall score was 0.814. Compared with their original counterparts, the AI-rewritten content had a cosine similarity score of 0.915 (95% CI, 0.908-0.922). The improved readability was attributed to simpler words and shorter sentences. The mean DISCERN score of the random sample of AI-generated content was equivalent to "good" (28.5±5), with no significant differences compared with their original counterparts.
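The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below does that arithmetic and also shows one simple way to compute a cosine similarity between two texts. The abstract does not describe how the texts were vectorized, so the bag-of-words representation, the function names, and the sample strings here are assumptions for illustration only.

```python
import math
from collections import Counter


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of word-count vectors (assumed representation;
    the study's vectorization is not stated in the abstract)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


# Precision and recall as reported in the Results.
print(round(f1_score(0.934, 0.814), 2))  # 0.87, matching the reported F1

# Hypothetical strings, written for this example only.
print(cosine_similarity("the tumor was removed", "the tumor was taken out"))
```

The harmonic mean of 0.934 and 0.814 is about 0.870, so the three reported figures are mutually consistent.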
Conclusions: Our study demonstrates the potential of AI chatbots to improve the readability of patient-facing content while maintaining content quality. The decrease in requisite literacy after AI revision emphasizes the potential of this technology to reduce health care disparities caused by a mismatch between educational resources available to a patient and their health literacy.
About the journal:
JNCCN—Journal of the National Comprehensive Cancer Network is a peer-reviewed medical journal read by over 25,000 oncologists and cancer care professionals nationwide. This indexed publication delivers the latest insights into best clinical practices, oncology health services research, and translational medicine. Notably, JNCCN provides updates on the NCCN Clinical Practice Guidelines in Oncology® (NCCN Guidelines®), review articles elaborating on guideline recommendations, health services research, and case reports that spotlight molecular insights in patient care.
Guided by its vision, JNCCN seeks to advance the mission of NCCN by serving as the primary resource for information on NCCN Guidelines®, innovation in translational medicine, and scientific studies related to oncology health services research. This encompasses quality care and value, bioethics, comparative and cost effectiveness, public policy, and interventional research on supportive care and survivorship.
JNCCN boasts indexing by prominent databases such as MEDLINE/PubMed, Chemical Abstracts, Embase, EmCare, and Scopus, reinforcing its standing as a reputable source for comprehensive information in the field of oncology.