Readability and Appropriateness of Responses Generated by ChatGPT 3.5, ChatGPT 4.0, Gemini, and Microsoft Copilot for FAQs in Refractive Surgery.

Fahri Onur Aydın, Burakhan Kürşat Aksoy, Ali Ceylan, Yusuf Berk Akbaş, Serhat Ermiş, Burçin Kepez Yıldız, Yusuf Yıldırım
{"title":"Readability and Appropriateness of Responses Generated by ChatGPT 3.5, ChatGPT 4.0, Gemini, and Microsoft Copilot for FAQs in Refractive Surgery.","authors":"Fahri Onur Aydın, Burakhan Kürşat Aksoy, Ali Ceylan, Yusuf Berk Akbaş, Serhat Ermiş, Burçin Kepez Yıldız, Yusuf Yıldırım","doi":"10.4274/tjo.galenos.2024.28234","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>To assess the appropriateness and readability of large language model (LLM) chatbots' answers to frequently asked questions about refractive surgery.</p><p><strong>Materials and methods: </strong>Four commonly used LLM chatbots were asked 40 questions frequently asked by patients about refractive surgery. The appropriateness of the answers was evaluated by 2 experienced refractive surgeons. Readability was evaluated with 5 different indexes.</p><p><strong>Results: </strong>Based on the responses generated by the LLM chatbots, 45% (n=18) of the answers given by ChatGPT 3.5 were correct, while this rate was 52.5% (n=21) for ChatGPT 4.0, 87.5% (n=35) for Gemini, and 60% (n=24) for Copilot. In terms of readability, it was observed that all LLM chatbots were very difficult to read and required a university degree.</p><p><strong>Conclusion: </strong>These LLM chatbots, which are finding a place in our daily lives, can occasionally provide inappropriate answers. Although all were difficult to read, Gemini was the most successful LLM chatbot in terms of generating appropriate answers and was relatively better in terms of readability.</p>","PeriodicalId":23373,"journal":{"name":"Turkish Journal of Ophthalmology","volume":"54 6","pages":"313-317"},"PeriodicalIF":0.0000,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11707452/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Turkish Journal of Ophthalmology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4274/tjo.galenos.2024.28234","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Medicine","Score":null,"Total":0}

Abstract

Objectives: To assess the appropriateness and readability of large language model (LLM) chatbots' answers to frequently asked questions about refractive surgery.

Materials and methods: Four commonly used LLM chatbots were each asked 40 questions frequently asked by patients about refractive surgery. The appropriateness of the answers was evaluated by two experienced refractive surgeons, and readability was evaluated with five different readability indices.
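The abstract does not name the five readability indices used. As a minimal sketch of how such scoring is typically automated, assuming five widely used formulas (Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog, SMOG, and Coleman-Liau) computed with the open-source Python library textstat, one chatbot answer could be profiled like this:

# Minimal sketch: scoring one chatbot answer with five common readability
# formulas via textstat (pip install textstat). Which five indices the
# study actually used is not stated in this abstract; these are assumptions.
import textstat

def readability_profile(answer: str) -> dict:
    """Return five common readability scores for a single chatbot answer."""
    return {
        "flesch_reading_ease": textstat.flesch_reading_ease(answer),    # higher = easier
        "flesch_kincaid_grade": textstat.flesch_kincaid_grade(answer),  # US school grade
        "gunning_fog": textstat.gunning_fog(answer),
        "smog_index": textstat.smog_index(answer),
        "coleman_liau_index": textstat.coleman_liau_index(answer),
    }

if __name__ == "__main__":
    sample = ("LASIK reshapes the cornea with an excimer laser to correct "
              "refractive errors such as myopia, hyperopia, and astigmatism.")
    for name, score in readability_profile(sample).items():
        print(f"{name}: {score:.1f}")

Grade-level scores of roughly 13 and above correspond to university-level reading, which is the threshold the authors invoke in the Results.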

Results: Of the 40 answers generated by each LLM chatbot, 45% (n=18) of ChatGPT 3.5's answers were correct, compared with 52.5% (n=21) for ChatGPT 4.0, 87.5% (n=35) for Gemini, and 60% (n=24) for Copilot. In terms of readability, the responses of all LLM chatbots were very difficult to read, requiring university-level reading ability.
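The percentages are consistent with 40 questions per chatbot; a quick sanity check using the counts reported above:

# Sanity check of the appropriateness rates in the Results
# (counts taken from the abstract; 40 questions per chatbot).
counts = {"ChatGPT 3.5": 18, "ChatGPT 4.0": 21, "Gemini": 35, "Copilot": 24}
for bot, n_ok in counts.items():
    print(f"{bot}: {n_ok}/40 = {n_ok / 40:.1%}")
# Output: 45.0%, 52.5%, 87.5%, 60.0% -- matching the reported rates.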

Conclusion: These LLM chatbots, which are finding a place in our daily lives, can occasionally provide inappropriate answers. Although all four produced text that was difficult to read, Gemini was the most successful chatbot at generating appropriate answers and was also relatively more readable.

Source Journal

Turkish Journal of Ophthalmology (Medicine - Ophthalmology)
CiteScore: 2.20
Self-citation rate: 0.00%

Journal description: The Turkish Journal of Ophthalmology (TJO) is the only scientific periodical publication of the Turkish Ophthalmological Association and has been published since January 1929. In its early years, the journal was published in Turkish and French. Although there were temporary interruptions in publication due to various challenges, the journal has been published continually from 1971 to the present. The target audience includes specialists and physicians in training in ophthalmology in all relevant disciplines.