ChatGPT-4 Generates More Accurate and Complete Responses to Common Patient Questions About Anterior Cruciate Ligament Reconstruction Than Google’s Search Engine

Michael A. Gaudiani M.D., Joshua P. Castle M.D., Muhammad J. Abbas M.D., Brittaney A. Pratt B.S., Marquisha D. Myles B.S., Vasilios Moutzouros M.D., T. Sean Lynch M.D.
Arthroscopy Sports Medicine and Rehabilitation. Journal article, published 2024-06-01. DOI: 10.1016/j.asmr.2024.100939

Abstract

Purpose

To replicate a patient’s internet search to evaluate ChatGPT’s appropriateness in answering common patient questions about anterior cruciate ligament reconstruction compared with a Google web search.

Methods

A Google web search was performed using the term “anterior cruciate ligament reconstruction.” The top 20 frequently asked questions and their responses were recorded. The prompt “What are the 20 most popular patient questions related to ‘anterior cruciate ligament reconstruction?’” was entered into ChatGPT, and the resulting questions and responses were recorded. Questions were classified using the Rothwell system, and responses from both Google web search and ChatGPT were assessed for Flesch-Kincaid Grade Level, correctness, and completeness.
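The Flesch-Kincaid Grade Level used above can be illustrated with a minimal sketch. The formula itself (0.39 × words per sentence + 11.8 × syllables per word − 15.59) is the standard one; the syllable counter, the sentence splitter, and the function names `count_syllables` and `flesch_kincaid_grade` are assumptions made for illustration and are not the tool the authors used, so scores may differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent unless the word ends in 'le' or 'ee'.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid Grade Level of an English text."""
    # Naive sentence split on terminal punctuation (an assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    simple = "The cat sat on the mat."
    dense = "Anterior cruciate ligament reconstruction requires postoperative rehabilitation."
    print(round(flesch_kincaid_grade(simple), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

A higher score corresponds to text that requires more years of schooling to read, which is why the lower mean score for Google web search responses indicates easier reading.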

Results

Three of 20 (15%) questions were similar between Google web search and ChatGPT. The most common question types in the Google web search were value (8/20, 40%), fact (7/20, 35%), and policy (5/20, 25%). The most common question types in the ChatGPT search were fact (12/20, 60%), policy (6/20, 30%), and value (2/20, 10%). Mean Flesch-Kincaid Grade Level was significantly lower for Google web search responses than for ChatGPT responses (11.8 ± 3.8 vs 14.3 ± 2.2; P = .003). For Google web search answers, mean correctness was 1.47 ± 0.5 and mean completeness was 1.36 ± 0.5. For ChatGPT answers, mean correctness was 1.8 ± 0.4 and mean completeness was 1.9 ± 0.3, both significantly greater than for Google web search answers (P = .03 and P = .0003, respectively).

Conclusions

ChatGPT-4 generated more accurate and complete responses to common patient questions about anterior cruciate ligament reconstruction than Google’s search engine.

Clinical Relevance

The use of artificial intelligence such as ChatGPT is expanding. It is important to understand the quality of the information it provides, as well as how the results of ChatGPT queries compare with those from Google web searches.
