Artificial Intelligence Shows Limited Success in Improving Readability Levels of Spanish-language Orthopaedic Patient Education Materials.

Clinical Orthopaedics and Related Research® · Impact factor 4.4 · Zone 2 (Medicine) · JCR Q1 (Orthopedics) · Pub Date: 2025-02-11 · DOI: 10.1097/CORR.0000000000003413
Rodnell Busigó Torres, Mariana Restrepo, Brocha Z Stern, B Israel Yahuaca, Rafael A Buerba, Ivan A García, Victor H Hernandez, Ronald A Navarro
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12189979/pdf/
Citations: 0

Abstract

Background: The more than 41 million people in the United States who speak Spanish represent one of the fastest-growing US populations. Non-English-speaking patients often face poorer health outcomes because of language barriers that hinder patient education. Orthopaedic education materials have limited availability in Spanish and may be difficult for some patients to read. The American Academy of Orthopaedic Surgeons (AAOS) has translated education materials into Spanish, but their readability levels remain unknown. Additionally, although artificial intelligence (AI) dialogue platforms have been shown to improve readability in English, no studies have specifically evaluated their effectiveness in non-English languages.

Questions/purposes: (1) What is the readability of AAOS Spanish-language education materials? (2) Can an AI dialogue platform improve the readability of Spanish-language education materials while maintaining their accuracy and usefulness?

Methods: After excluding COVID-19 articles and inaccessible websites, Spanish-language education materials were extracted from the AAOS OrthoInfo website, and their Fernández-Huerta and Spanish Orthographic Length (SOL) readability grade levels were calculated. Fernández-Huerta focuses on syntactic complexity (sentence and syllable structure) and SOL assesses lexical complexity (word length and frequency). For both, the higher the grade level, the harder the text is to read. Education materials with a reading level above the sixth-grade level were entered into the ChatGPT-4 AI platform to be adapted to a fifth-grade level. Readability metrics of the adaptations were reassessed and compared with the original versions. Secondarily, one of four Spanish-speaking orthopaedic surgeons evaluated each AI-adapted education material for accuracy and usefulness compared with the original version. We used a single review per material, trusting the orthopaedic surgeon's expertise to minimize discrepancies. We included a total of 77 of 82 education materials covering topics such as diseases and conditions, treatment, and recovery and staying healthy.
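
To illustrate the kind of sentence-and-syllable calculation that underlies Fernández-Huerta, here is a minimal sketch and not the study's actual pipeline. It assumes the commonly cited form of the formula, 206.84 − 0.60 × (syllables per 100 words) − 1.02 × (words per sentence); the abstract does not state the formula, and the definition of the sentence term varies between sources. The syllable counter is a rough heuristic (one syllable per run of vowels) and the function names are hypothetical. The raw score runs opposite to grade level: a higher score means easier text. The conversion from score to grade level used in the study is not given in the abstract and is not reproduced here.

```python
import re

# Assumption: Spanish vowels, including accented forms; "y" is ignored.
VOWELS = "aeiouáéíóúü"


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of consecutive vowels.

    This undercounts hiatus (e.g. "cae" is two syllables but counts as one),
    so treat the result as an approximation only.
    """
    groups = re.findall(f"[{VOWELS}]+", word.lower())
    return max(1, len(groups))


def fernandez_huerta(text: str) -> float:
    """Fernández-Huerta-style ease score (assumed form; higher = easier)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-záéíóúüñ]+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables_per_100_words = 100 * sum(count_syllables(w) for w in words) / len(words)
    words_per_sentence = len(words) / len(sentences)
    return 206.84 - 0.60 * syllables_per_100_words - 1.02 * words_per_sentence


simple = "El sol sale."
dense = "La articulación glenohumeral presenta inestabilidad multidireccional significativa."
print(fernandez_huerta(simple) > fernandez_huerta(dense))  # shorter words and sentences score as easier
```

A production tool would need a proper Spanish syllabifier and the published score-to-grade conversion before its output could be compared with the grade levels reported here.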

Results: Before AI adaptations, none of the 77 education materials met the recommended reading level of sixth grade or below according to both readability formulas. The original education materials were written at a seventh- to eighth-grade reading level in 32% of cases (25 of 77). In comparison, after a single attempt at simplification, AI-adapted materials achieved this reading level in 53% of cases (41 of 77; p < 0.001). Only 23% (18) and 16% (12) of the AI adaptations were written at or below the recommended sixth-grade level per the Fernández-Huerta and SOL grade levels, respectively. Of the AI adaptations, 52% (40) were rated as accurate and 56% (43) were rated as useful for patient education by the evaluating orthopaedic surgeons. AI adaptations that were classified as accurate or useful had a higher median (IQR) word count than those that were inaccurate (accurate 255 [216 to 331] versus inaccurate 236 [209 to 256]; p = 0.04) or not useful (useful 257 [216 to 337] versus not useful 233 [209 to 251]; p = 0.01).
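
The percentages above can be checked against the stated counts with simple arithmetic. The sketch below uses only the counts and the denominator of 77 given in the abstract; the dictionary labels and the helper name are illustrative.

```python
TOTAL_MATERIALS = 77  # education materials included in the study

# label -> (count reported in the abstract, percentage reported in the abstract)
reported = {
    "original at 7th- to 8th-grade level": (25, 32),
    "AI-adapted at 7th- to 8th-grade level": (41, 53),
    "AI-adapted at or below 6th grade (Fernández-Huerta)": (18, 23),
    "AI-adapted at or below 6th grade (SOL)": (12, 16),
    "AI-adapted rated accurate": (40, 52),
    "AI-adapted rated useful": (43, 56),
}


def percent(count: int, total: int = TOTAL_MATERIALS) -> int:
    """Share of the total, rounded to the nearest whole percent."""
    return round(100 * count / total)


for label, (count, stated) in reported.items():
    print(f"{label}: {count}/{TOTAL_MATERIALS} = {percent(count)}% (abstract: {stated}%)")
```

Each computed value matches the figure in the abstract, so the reported percentages are consistent with a denominator of 77 throughout.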

Conclusion: Ongoing attention is needed to improve the readability of Spanish education materials to reduce health disparities. ChatGPT-4 has limited success in improving readability without compromising accuracy and usefulness. We urge AAOS to enhance the readability of these materials and recommend physicians use them as supplemental resources while prioritizing direct patient education for Spanish-speaking individuals. Further research is needed to develop readable and culturally appropriate education materials for non-English-speaking patients that incorporate direct patient feedback.

Clinical relevance: This study shows that Spanish-language orthopaedic materials often exceed recommended readability levels, limiting their effectiveness and worsening health disparities. While AI tools like ChatGPT-4 improve readability, they may fall short in accuracy and usefulness. This underscores the need for clearer, culturally appropriate materials and the importance of physicians providing direct education.

Source journal
CiteScore: 7.00
Self-citation rate: 11.90%
Articles published: 722
Review time: 2.5 months
Journal description: Clinical Orthopaedics and Related Research® is a leading peer-reviewed journal devoted to the dissemination of new and important orthopaedic knowledge. CORR® brings readers the latest clinical and basic research, along with columns, commentaries, and interviews with authors.