Eyes on the Text: Assessing Readability of AI & Ophthalmologist Responses to Patient Surgery Queries.

IF 2.1 · Zone 4 (Medicine) · Q2 Ophthalmology · Ophthalmologica · Pub Date: 2025-03-10 · DOI: 10.1159/000544917
Sai S Kurapati, Derek J Barnett, Antonio Yaghy, Cameron J Sabet, David N Younessi, Dang Nguyen, John C Lin, Ingrid U Scott
Citations: 0

Abstract

Introduction: Generative artificial intelligence (AI) technologies like GPT-4 can instantaneously provide health information to patients; however, the readability of these outputs compared to ophthalmologist-written responses is unknown. This study aims to evaluate the readability of GPT-4-generated and ophthalmologist-written responses to patient queries about ophthalmic surgery.

Methods: This retrospective cross-sectional study used 200 randomly selected patient questions about ophthalmic surgery extracted from the American Academy of Ophthalmology's EyeSmart platform. The questions were entered into GPT-4, and the generated responses were recorded. Ophthalmologist-written replies to the same questions were compiled for comparison. Readability of GPT-4 and ophthalmologist responses was assessed using six validated metrics: Flesch-Kincaid Reading Ease (FK-RE), Flesch-Kincaid Grade Level (FK-GL), Gunning Fog Score (GFS), SMOG Index (SI), Coleman-Liau Index (CLI), and Automated Readability Index (ARI). Readability was compared between the two groups using descriptive statistics, one-way ANOVA, and the Shapiro-Wilk and Levene's tests (α=0.05).
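For readers unfamiliar with the six metrics, the following is a minimal Python sketch of their standard published formulas. It is an illustration only, not the authors' tooling: the abstract does not say which software computed the scores. The `TextCounts` container and all function names are hypothetical, and the sketch assumes the caller has already counted sentences, words, syllables, and characters (syllable counting in particular is heuristic and varies between tools). It also uses a single "three or more syllables" count for both Gunning Fog and SMOG, which is a simplification.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class TextCounts:
    """Raw counts for one response (hypothetical container; counting is left to the caller)."""
    sentences: int
    words: int
    syllables: int
    letters: int        # alphabetic characters, used by Coleman-Liau
    characters: int     # letters and digits, used by ARI
    complex_words: int  # words with three or more syllables (simplified definition)


def flesch_reading_ease(c: TextCounts) -> float:
    # An ease score: higher means easier text. The other five return grade levels.
    return 206.835 - 1.015 * (c.words / c.sentences) - 84.6 * (c.syllables / c.words)


def flesch_kincaid_grade(c: TextCounts) -> float:
    return 0.39 * (c.words / c.sentences) + 11.8 * (c.syllables / c.words) - 15.59


def gunning_fog(c: TextCounts) -> float:
    return 0.4 * ((c.words / c.sentences) + 100 * (c.complex_words / c.words))


def smog_index(c: TextCounts) -> float:
    # Polysyllable count is normalised to a 30-sentence sample.
    return 1.0430 * sqrt(c.complex_words * 30 / c.sentences) + 3.1291


def coleman_liau(c: TextCounts) -> float:
    letters_per_100_words = 100 * c.letters / c.words
    sentences_per_100_words = 100 * c.sentences / c.words
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


def automated_readability(c: TextCounts) -> float:
    return 4.71 * (c.characters / c.words) + 0.5 * (c.words / c.sentences) - 21.43


# Usage with made-up counts (not data from the study).
example = TextCounts(sentences=5, words=100, syllables=150,
                     letters=500, characters=500, complex_words=10)
for metric in (flesch_reading_ease, flesch_kincaid_grade, gunning_fog,
               smog_index, coleman_liau, automated_readability):
    print(f"{metric.__name__}: {metric(example):.2f}")
```

Because FK-RE runs in the opposite direction from the five grade-level metrics, a lower FK-RE and a higher grade level both indicate harder text.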

Results: GPT-4 used a higher percentage of complex words (24.42%) than ophthalmologists (17.76%), although mean [SD] word count per sentence was similar (18.43 [2.95] and 18.01 [6.09], respectively). Across all metrics (FK-RE; FK-GL; GFS; SI; CLI; and ARI), GPT-4 responses were harder to read (34.39 [8.51]; 13.19 [2.63]; 16.37 [2.04]; 12.18 [1.43]; 15.72 [1.40]; 12.99 [1.86]) than ophthalmologists' responses (50.61 [15.53]; 10.71 [2.99]; 14.13 [3.55]; 10.07 [2.46]; 12.64 [2.93]; 10.40 [3.61]), with both sources necessitating a 12th-grade education for comprehension. For FK-RE a lower score indicates harder text; for the other five metrics a higher score corresponds to a higher grade level. ANOVA tests showed significance (p<0.05) for all comparisons except word count (p=0.438).

Conclusions: The National Institutes of Health advises that health information be written at a sixth- to seventh-grade level. Both GPT-4- and ophthalmologist-written answers exceeded this recommendation, with GPT-4 showing a greater gap. Information accessibility is vital when designing patient resources, particularly with the rise of AI as an educational tool.
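The size of that gap can be worked out from the mean grade-level scores reported in the Results. The sketch below is a rough illustration, not an analysis from the paper: it assumes grade 7 (the upper end of the recommendation) as the target, omits FK-RE because it is an ease score rather than a grade level, and uses hypothetical names (`REPORTED_MEANS`, `grade_gaps`).

```python
# Mean grade-level scores from the Results: (GPT-4, ophthalmologists).
# FK-RE is left out because it is an ease score, not a grade level.
REPORTED_MEANS = {
    "FK-GL": (13.19, 10.71),
    "GFS": (16.37, 14.13),
    "SI": (12.18, 10.07),
    "CLI": (15.72, 12.64),
    "ARI": (12.99, 10.40),
}

# Assumption: use the upper end of the sixth-seventh grade recommendation.
TARGET_GRADE = 7.0


def grade_gaps(means, target=TARGET_GRADE):
    """Return, per metric, how many grade levels each source sits above the target."""
    return {
        metric: (round(gpt4 - target, 2), round(ophth - target, 2))
        for metric, (gpt4, ophth) in means.items()
    }


for metric, (gpt4_gap, ophth_gap) in grade_gaps(REPORTED_MEANS).items():
    print(f"{metric}: GPT-4 +{gpt4_gap:.2f} grades, ophthalmologists +{ophth_gap:.2f} grades")
```

On every grade-level metric both sources sit above the target, and the GPT-4 gap is the larger of the two, which matches the conclusion stated above.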

Source journal
Ophthalmologica (Medicine: Ophthalmology)
CiteScore: 5.10
Self-citation rate: 3.80%
Articles published: 39
Review time: 3 months
Journal description: Published since 1899, "Ophthalmologica" has become a frequently cited guide to international work in clinical and experimental ophthalmology. It contains a selection of patient-oriented contributions covering the etiology of eye diseases, diagnostic techniques, and advances in medical and surgical treatment. Straightforward, factual reporting provides both interesting and useful reading. In addition to original papers, "Ophthalmologica" regularly features timely reviews in an effort to keep the reader well informed and updated. The large international circulation of this journal reflects its importance.