Catherine Hand, Camden Bohn, Shadia Tannir, Marisa Ulrich, Sami Saniei, Miguel Girod-Hoffman, Yining Lu, Brian Forsythe
{"title":"American Academy of Orthopedic Surgery OrthoInfo provides more readable information regarding rotator cuff injury than ChatGPT.","authors":"Catherine Hand, Camden Bohn, Shadia Tannir, Marisa Ulrich, Sami Saniei, Miguel Girod-Hoffman, Yining Lu, Brian Forsythe","doi":"10.1016/j.jisako.2025.100841","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>With over 61% of Americans seeking health information online, the accuracy and readability of this content are critical. AI tools, like ChatGPT, have gained popularity in providing medical information, but concerns remain about their accessibility, especially for individuals with lower literacy levels. This study compares the readability and accuracy of ChatGPT-generated content with information from the American Academy of Orthopedic Surgery (AAOS) OrthoInfo website, focusing on rotator cuff injuries.</p><p><strong>Methods: </strong>We formulated seven frequently asked questions about rotator cuff injuries, based on the OrthoInfo website, and gathered responses from both ChatGPT-4 and OrthoInfo. Readability was assessed using multiple readability metrics (Flesch-Kincaid, Gunning Fog, Coleman-Liau, SMOG Readability Formula, FORCAST Readability Formula, Fry Graph, Raygor Readability Estimate), while accuracy was evaluated by three independent reviewers. Statistical analysis included t-tests and correlation analysis.</p><p><strong>Results: </strong>ChatGPT responses required a higher education level to comprehend, with an average grade level of 14.7, compared to OrthoInfo's 11.9 (p < 0.01). The Flesch Reading Ease Index indicated that OrthoInfo's content (52.5) was more readable than ChatGPT's (25.9, p < 0.01). Both sources had high accuracy, with ChatGPT slightly lower in accuracy for the question about further damage to the rotator cuff (p < 0.05).</p><p><strong>Conclusion: </strong>ChatGPT shows promise in delivering accurate health information but may not be suitable for all patients due to its higher complexity. A combination of AI and expert-reviewed, accessible content may enhance patient understanding and health literacy. Future developments should focus on improving AI's adaptability to different literacy levels.</p><p><strong>Level of evidence: </strong>IV.</p>","PeriodicalId":36847,"journal":{"name":"Journal of ISAKOS Joint Disorders & Orthopaedic Sports Medicine","volume":" ","pages":"100841"},"PeriodicalIF":2.7000,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of ISAKOS Joint Disorders & Orthopaedic Sports Medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.jisako.2025.100841","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ORTHOPEDICS","Score":null,"Total":0}
Abstract
Introduction: With over 61% of Americans seeking health information online, the accuracy and readability of this content are critical. AI tools such as ChatGPT have gained popularity as sources of medical information, but concerns remain about their accessibility, especially for individuals with lower literacy levels. This study compares the readability and accuracy of ChatGPT-generated content with information from the American Academy of Orthopaedic Surgeons (AAOS) OrthoInfo website, focusing on rotator cuff injuries.
Methods: We formulated seven frequently asked questions about rotator cuff injuries, based on the OrthoInfo website, and gathered responses from both ChatGPT-4 and OrthoInfo. Readability was assessed using multiple readability metrics (Flesch-Kincaid, Gunning Fog, Coleman-Liau, SMOG Readability Formula, FORCAST Readability Formula, Fry Graph, Raygor Readability Estimate), while accuracy was evaluated by three independent reviewers. Statistical analysis included t-tests and correlation analysis.
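To illustrate how the grade-level scoring behind these metrics works, the sketch below computes the two Flesch measures reported in the Results from raw counts of sentences, words, and syllables. This is not the authors' actual analysis pipeline; the function names are hypothetical and the syllable counter is a crude vowel-group heuristic, but the coefficients are the standard published Flesch formulas.

```python
import re

def count_syllables(word: str) -> int:
    """Crude heuristic: count vowel groups, never return fewer than 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_scores(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a passage."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)

    words_per_sentence = n_words / sentences
    syllables_per_word = n_syllables / n_words

    # Standard coefficients for the two Flesch formulas.
    reading_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade_level = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return reading_ease, grade_level

ease, grade = flesch_scores(
    "The rotator cuff is a group of muscles and tendons that surround the shoulder joint."
)
print(f"Reading ease: {ease:.1f}, grade level: {grade:.1f}")
```

Higher reading-ease scores indicate easier text, while the grade level approximates the years of schooling needed to understand it; the other formulas in the study (Gunning Fog, Coleman-Liau, SMOG, FORCAST, Fry, Raygor) follow the same pattern with different inputs and coefficients.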
Results: ChatGPT responses required a higher reading level to comprehend, with an average grade level of 14.7 compared with OrthoInfo's 11.9 (p < 0.01). The Flesch Reading Ease Index indicated that OrthoInfo's content (52.5) was more readable than ChatGPT's (25.9, p < 0.01). Both sources had high accuracy, although ChatGPT was rated slightly lower for the question about further damage to the rotator cuff (p < 0.05).
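For readers unfamiliar with the comparison behind these p-values, a minimal sketch of the kind of two-sample t-test the Methods describe is shown below, assuming one grade-level score per question for each source. The numbers are placeholders, not the study data.

```python
from scipy import stats

# Hypothetical per-question grade-level scores (placeholders, not the study data).
chatgpt_grades   = [15.2, 14.1, 14.9, 15.0, 14.3, 14.8, 14.6]
orthoinfo_grades = [12.3, 11.5, 12.0, 11.8, 11.7, 12.1, 11.9]

# Welch's t-test (no equal-variance assumption) comparing mean grade level.
t_stat, p_value = stats.ttest_ind(chatgpt_grades, orthoinfo_grades, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```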
Conclusion: ChatGPT shows promise in delivering accurate health information but may not be suitable for all patients due to its higher complexity. A combination of AI and expert-reviewed, accessible content may enhance patient understanding and health literacy. Future developments should focus on improving AI's adaptability to different literacy levels.