Chat Generative Pre-Trained Transformer (ChatGPT) - 3.5 Responses Require Advanced Readability for the General Population and May Not Effectively Supplement Patient-Related Information Provided by the Treating Surgeon Regarding Common Questions About Rotator Cuff Repair.
Emma Eng, Colton Mowers, Divesh Sachdev, Payton Yerke-Hansen, Garrett R Jackson, Derrick M Knapik, Vani J Sabesan
Arthroscopy-The Journal of Arthroscopic and Related Surgery, pages 42-52. Published 2025-01-01 (Epub 2024-05-21). DOI: 10.1016/j.arthro.2024.05.009. Journal Impact Factor 4.4; JCR Q1 (Orthopedics).
Citations: 0
Abstract
Purpose: To investigate the accuracy of Chat Generative Pre-Trained Transformer (ChatGPT)'s responses to frequently asked questions prior to rotator cuff repair surgery.
Methods: The 10 most common frequently asked questions related to rotator cuff repair were compiled from 4 institution websites. Questions were then input into ChatGPT-3.5 in 1 session. The provided ChatGPT-3.5 responses were analyzed by 2 orthopaedic surgeons for reliability, quality, and readability using the Journal of the American Medical Association Benchmark criteria, the DISCERN score, and the Flesch-Kincaid Grade Level.
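The Flesch-Kincaid Grade Level used in the methods above is a standard formula over average sentence length and average syllables per word: 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. A minimal Python sketch is shown below; the vowel-group syllable counter is a naive assumption for illustration (validated readability tools use pronunciation dictionaries or dedicated libraries):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count contiguous vowel groups (an assumption,
    # not the dictionary-based counting used by validated tools).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # Split sentences on terminal punctuation and extract word tokens.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid Grade Level formula.
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

A grade level of 13.4, as reported in the results, corresponds to college-level text, well above the sixth-to-eighth-grade level commonly recommended for patient education materials.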
Results: The Journal of the American Medical Association Benchmark criteria score was 0, indicating the absence of reliable source material citations. The mean Flesch-Kincaid Grade Level was 13.4 (range, 11.2-15.0). The mean DISCERN score was 43.4 (range, 36-51), indicating that the overall quality of the responses was fair. All responses recommended that final decisions be made with the treating physician.
Conclusions: ChatGPT-3.5 provided substandard patient-related information to supplement the recommendations of the treating surgeon regarding common questions about rotator cuff repair surgery. Additionally, the responses lacked reliable source material citations, and their readability was relatively advanced, with a complex language style.
Clinical relevance: The findings of this study suggest that ChatGPT-3.5 may not effectively supplement patient-related information in the context of recommendations provided by the treating surgeon prior to rotator cuff repair surgery.
About the journal:
Nowhere is minimally invasive surgery explained better than in Arthroscopy, the leading peer-reviewed journal in the field. Every issue enables you to put the usefulness of various emerging arthroscopic techniques into perspective. The advantages and disadvantages of these methods, along with their applications in various situations, are discussed in relation to their efficiency, efficacy, and cost benefit. As a special incentive, paid subscribers also receive access to the journal's expanded website.