Artificial Intelligence Shows Limited Success in Improving Readability Levels of Spanish-language Orthopaedic Patient Education Materials.

Clinical Orthopaedics and Related Research® (IF 4.2; Zone 2, Medicine; Q1, Orthopedics) Pub Date: 2025-02-11 DOI: 10.1097/CORR.0000000000003413
Rodnell Busigó Torres, Mariana Restrepo, Brocha Z Stern, B Israel Yahuaca, Rafael A Buerba, Ivan A García, Victor H Hernandez, Ronald A Navarro

Abstract

Background: The more than 41 million people in the United States who speak Spanish represent one of the fastest-growing US populations. Non-English-speaking patients often face poorer health outcomes because of language barriers that hinder patient education. Orthopaedic education materials have limited availability in Spanish and may be difficult for some patients to read. The American Academy of Orthopaedic Surgeons (AAOS) has translated education materials into Spanish, but their readability levels remain unknown. Additionally, although artificial intelligence (AI) dialogue platforms have been shown to improve readability in English, no studies have specifically evaluated their effectiveness in non-English languages.

Questions/purposes: (1) What is the readability of AAOS Spanish-language education materials? (2) Can an AI dialogue platform improve the readability of Spanish-language education materials while maintaining their accuracy and usefulness?

Methods: After excluding COVID-19 articles and inaccessible websites, Spanish-language education materials were extracted from the AAOS OrthoInfo website, and their Fernández-Huerta and Spanish Orthographic Length (SOL) readability grade levels were calculated. Fernández-Huerta focuses on syntactic complexity (sentence and syllable structure), and SOL assesses lexical complexity (word length and frequency). For both, the higher the grade level, the harder the text is to read. Education materials with a reading level above the sixth-grade level were entered into the ChatGPT-4 AI platform to be adapted to a fifth-grade level. Readability metrics of the adaptations were reassessed and compared with the original versions. Secondarily, one of four Spanish-speaking orthopaedic surgeons evaluated each AI-adapted education material for accuracy and usefulness compared with the original version. We used a single review per material, trusting the orthopaedic surgeon's expertise to minimize discrepancies. We included a total of 77 of 82 education materials covering topics such as diseases and conditions, treatment, and recovery and staying healthy.
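
For readers unfamiliar with the Fernández-Huerta index, the following is a minimal sketch of how such a score might be computed. It is not the authors' tooling. It assumes the commonly cited form of the formula, 206.84 - 0.60P - 1.02F, with P as syllables per 100 words and F as average words per sentence (some sources define F differently). The syllable counter is a rough vowel-run approximation, and the function names are hypothetical. The abstract reports grade levels rather than raw scores, and because the score-to-grade conversion is not stated here, it is not shown. SOL is likewise omitted because its formula is not given in this abstract.

```python
import re

VOWELS = "aeiouáéíóúü"


def count_syllables(word: str) -> int:
    """Rough estimate: one syllable per run of consecutive vowels.

    Assumption for illustration only. Hiatus (e.g. "día") is
    undercounted because the whole vowel run counts once.
    """
    groups = re.findall(f"[{VOWELS}]+", word.lower())
    return max(1, len(groups))


def fernandez_huerta(text: str) -> float:
    """Fernández-Huerta reading-ease score (higher means easier)."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Keep only alphabetic Spanish tokens.
    words = re.findall(r"[a-záéíóúüñ]+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    p = 100 * syllables / len(words)   # syllables per 100 words
    f = len(words) / len(sentences)    # average words per sentence (assumed definition)
    return 206.84 - 0.60 * p - 1.02 * f


if __name__ == "__main__":
    sample = "El hueso roto duele. Visite a su médico."
    print(round(fernandez_huerta(sample), 2))
```

A production analysis would need a proper Spanish syllabifier and sentence splitter; the sketch is only meant to show which text features drive the score.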

Results: Before AI adaptations, none of the 77 education materials met the recommended reading level of sixth grade or below according to both readability formulas. The original education materials were written at a seventh- to eighth-grade reading level in 32% of cases (25 of 77). In comparison, after a single attempt at simplification, AI-adapted materials achieved this reading level in 53% of cases (41 of 77; p < 0.001). Only 23% (18) and 16% (12) of the AI adaptations were written at or below the recommended sixth-grade level per the Fernández-Huerta and SOL grade levels, respectively. Of the AI adaptations, 52% (40) were rated as accurate and 56% (43) were rated as useful for patient education by the evaluating orthopaedic surgeons. AI adaptations that were classified as accurate or useful had a higher median (IQR) word count than those that were inaccurate (accurate 255 [216 to 331] versus inaccurate 236 [209 to 256]; p = 0.04) or not useful (useful 257 [216 to 337] versus not useful 233 [209 to 251]; p = 0.01).
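
As a quick arithmetic check, the reported percentages can be recomputed from the counts given above, with 77 materials as the denominator. This is a small sketch for the reader's convenience; the labels and the `percent` helper are illustrative names, not part of the study.

```python
TOTAL = 77  # education materials included in the study

# label -> (count reported in the abstract, percentage reported in the abstract)
reported = {
    "original at 7th-8th grade": (25, 32),
    "AI-adapted at 7th-8th grade": (41, 53),
    "AI-adapted at or below 6th grade (Fernández-Huerta)": (18, 23),
    "AI-adapted at or below 6th grade (SOL)": (12, 16),
    "AI-adapted rated accurate": (40, 52),
    "AI-adapted rated useful": (43, 56),
}


def percent(count: int, total: int = TOTAL) -> int:
    """Percentage of total, rounded to the nearest whole number."""
    return round(100 * count / total)


for label, (count, pct) in reported.items():
    print(f"{label}: {count}/{TOTAL} = {percent(count)}% (reported {pct}%)")
```

Each recomputed value matches the rounded figure in the abstract.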

Conclusion: Ongoing attention is needed to improve the readability of Spanish education materials to reduce health disparities. ChatGPT-4 has limited success in improving readability without compromising accuracy and usefulness. We urge AAOS to enhance the readability of these materials and recommend physicians use them as supplemental resources while prioritizing direct patient education for Spanish-speaking individuals. Further research is needed to develop readable and culturally appropriate education materials for non-English-speaking patients that incorporate direct patient feedback.

Clinical relevance: This study shows that Spanish-language orthopaedic materials often exceed recommended readability levels, limiting their effectiveness and worsening health disparities. While AI tools like ChatGPT-4 improve readability, they may fall short in accuracy and usefulness. This underscores the need for clearer, culturally appropriate materials and the importance of physicians providing direct education.
