[Efficacy and safety of artificial intelligence-based large language models for decision making support in herniology: evaluation by experts and general surgeons].

Q4 Medicine Khirurgiya Pub Date : 2024-01-01 DOI:10.17116/hirurgia20240816
T V Nechay, A V Sazhin, K M Loban, A K Bogomolova, V V Suglob, T R Beniia
Khirurgiya, 2024, no. 8, pp. 6-14. Journal Article. https://doi.org/10.17116/hirurgia20240816

Abstract

Objective: To evaluate the quality of recommendations provided by ChatGPT regarding inguinal hernia repair.

Material and methods: ChatGPT was asked 5 questions about the surgical management of inguinal hernias. The chatbot was assigned the role of an expert in herniology and asked to search only specialized medical databases and to provide references and supporting evidence. Herniology experts and surgeons (non-experts) rated the quality of the recommendations generated by ChatGPT on a 4-point scale (from 0 to 3 points). Statistical correlations between participants' ratings and their stance on artificial intelligence were explored.

Results: Experts scored the quality of ChatGPT responses lower than non-experts did (2 (1-2) vs. 2 (2-3), p<0.001). The chatbot failed to provide valid references and actual evidence, and it falsified half of the references. Respondents were optimistic about the future of neural networks for clinical decision-making support. Most of them were against restricting the use of neural networks in healthcare.

Conclusion: We would not recommend non-specialized large language models as a sole or primary source of information for clinical decision making, or as a virtual search assistant.
