Enhancing patient education on the role of tibial osteotomy in the management of knee osteoarthritis using a customized ChatGPT: a readability and quality assessment.

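Frontiers in Digital Health, Volume 6, Article 1480381; IF 3.2; JCR Q1 (Health Care Sciences & Services); Pub Date: 2025-01-03; eCollection Date: 2024-01-01; DOI: 10.3389/fdgth.2024.1480381
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11738919/pdf/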

Stephen Fahy, Stephan Oehme, Danko Dan Milinkovic, Benjamin Bartek


Citations: 0

Abstract

Introduction: Knee osteoarthritis (OA) significantly impacts the quality of life of those afflicted, with many patients eventually requiring surgical intervention. While Total Knee Arthroplasty (TKA) is common, it may not be suitable for younger patients with unicompartmental OA, who might benefit more from High Tibial Osteotomy (HTO). Effective patient education is crucial for informed decision-making, yet most online health information has been found to be too complex for the average patient to understand. AI tools like ChatGPT may offer a solution, but their outputs often exceed the public's literacy level. This study assessed whether a customized ChatGPT could be used to improve readability and source accuracy in patient education on knee OA and tibial osteotomy.

Methods: Commonly asked questions about HTO were gathered using Google's "People Also Ask" feature and formatted to an 8th-grade reading level. Two ChatGPT-4 models were compared: a native version and a fine-tuned model ("The Knee Guide") optimized for readability and source citation through Instruction-Based Fine-Tuning (IBFT) and Reinforcement Learning from Human Feedback (RLHF). The responses were evaluated for quality using the DISCERN criteria and for readability using the Flesch Reading Ease Score (FRES) and Flesch-Kincaid Grade Level (FKGL).
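For reference, both readability metrics are simple functions of average sentence length and average syllables per word. The following is a minimal Python sketch of the standard published Flesch formulas, not the tooling the authors used (the abstract does not name it). The function names are illustrative assumptions, and the syllable counter is a crude vowel-group heuristic, so scores may differ slightly from those of dedicated readability calculators.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_scores(words: int, sentences: int, syllables: int) -> tuple[float, float]:
    """Return (FRES, FKGL) from raw counts using the standard formulas."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    fres = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fres, fkgl


def readability(text: str) -> tuple[float, float]:
    """Tokenize text naively and return (FRES, FKGL)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return flesch_scores(len(words), len(sentences), syllables)


if __name__ == "__main__":
    # Hypothetical patient-facing sentence, not taken from the study.
    fres, fkgl = readability("The surgeon cuts the shin bone. This shifts weight off the worn side.")
    print(f"FRES={fres:.1f}, FKGL={fkgl:.1f}")
```

A higher FRES means easier text, while FKGL approximates the US school grade needed to read it, which is why the two scores move in opposite directions in the results.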

Results: The native ChatGPT-4 model achieved a mean DISCERN score of 38.41 (range 25-46), indicating poor quality, while "The Knee Guide" achieved 45.9 (range 33-66), indicating moderate quality. Cronbach's alpha was 0.86, indicating good inter-rater reliability. "The Knee Guide" achieved better readability, with a mean FKGL of 8.2 (range 5-10.7, ±1.42) and a mean FRES of 60 (range 47-76, ±7.83), compared to the native model's FKGL of 13.9 (range 11-16, ±1.39) and FRES of 32 (range 14-47, ±8.3). These differences were statistically significant (p < 0.001).
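As an illustration of the reliability statistic reported above, Cronbach's alpha can be computed by treating each rater as an "item" and each scored response as a case. The abstract does not state how many raters took part or how alpha was calculated, so the sketch below is only one common formulation under that assumption; the function name and the example scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings: list[list[float]]) -> float:
    """ratings[i][j] is the score rater i gave to response j.

    alpha = k / (k - 1) * (1 - sum(per-rater variances) / variance(summed scores))
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two raters")
    n = len(ratings[0])
    if n < 2 or any(len(r) != n for r in ratings):
        raise ValueError("every rater must score the same two or more responses")
    # Sample variance of each rater's scores, summed over raters.
    rater_variance_sum = sum(variance(r) for r in ratings)
    # Sample variance of the per-response totals across raters.
    totals = [sum(scores) for scores in zip(*ratings)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("summed scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - rater_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical DISCERN totals from two raters, not data from the study.
    rater_a = [38, 41, 45, 50, 33]
    rater_b = [36, 44, 47, 52, 35]
    print(round(cronbach_alpha([rater_a, rater_b]), 2))
```

Values closer to 1 indicate that raters rank the responses consistently; identical ratings from all raters give exactly 1.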

Conclusions: Fine-tuning ChatGPT significantly improved the readability and quality of HTO-related information. "The Knee Guide" demonstrated the potential of customized AI tools in enhancing patient education by making complex medical information more accessible and understandable.


