A cross-sectional study to evaluate responses generated by two AI software programs for common patient queries about laparoscopic repair of inguinal hernia.

IF 2.2 · CAS Zone 3 (Medicine) · Q2 SURGERY · Updates in Surgery · Pub Date: 2025-04-01 · Epub Date: 2025-03-05 · DOI: 10.1007/s13304-025-02158-5
Meeran Banday, Kirat Kaur
Updates in Surgery, pp. 583-588 · Publication type: Journal Article · Publication model: Epub
Citations: 0

Abstract

This study aimed to evaluate the quality and accuracy of responses provided by two user-interactive AI chatbots, ChatGPT and ChatSonic, to patient queries about laparoscopic repair of inguinal hernias, and to determine how suitable these chatbots are for addressing such queries. Ten questions about laparoscopic repair of inguinal hernias were developed and presented to ChatGPT 4.0 and ChatSonic. Two experienced surgeons, blinded to the source, evaluated the responses using the Global Quality Score (GQS) and the modified DISCERN score to gauge response quality and reliability. ChatGPT produced high-quality responses (GQS of 4 or 5) for all ten questions according to one evaluator, and for seven of the ten according to the other. Similarly, ChatGPT showed high reliability (DISCERN of 4 or 5) for nine responses according to one evaluator and for three according to the other, with only slight agreement between evaluators for both GQS (kappa = 0.20) and modified DISCERN scores (kappa = 0.08). ChatSonic also provided high-quality and reliable responses to a majority of questions, albeit to a lesser extent than ChatGPT, and both chatbots demonstrated limited concordance in responses (p > 0.05). Overall, both ChatGPT and ChatSonic demonstrated potential utility in answering patient queries about hernia surgery. However, because of inconsistencies in reliability and quality, ongoing refinement and validation of AI-generated medical information remain necessary before widespread clinical adoption.
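The kappa values reported above measure how far the two evaluators agreed beyond what chance alone would produce. As an illustration only, the sketch below computes unweighted Cohen's kappa for two raters in Python. The abstract does not say whether the authors used a weighted or unweighted kappa, and the scores in the example are hypothetical, not the study's data; the function name `cohen_kappa` is likewise an assumption made for this sketch.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items that received identical scores.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, derived from each rater's marginal score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical 1-5 scores for ten responses (NOT the study's data).
scores_a = [5, 4, 5, 4, 5, 4, 5, 5, 4, 4]
scores_b = [4, 4, 5, 3, 4, 4, 3, 5, 4, 5]
kappa = cohen_kappa(scores_a, scores_b)
print(f"kappa = {kappa:.2f}")
```

Under the commonly cited Landis and Koch convention, kappa values from 0.00 to 0.20 are described as slight agreement, which is consistent with how the abstract characterizes its values of 0.20 and 0.08.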

Source journal
Updates in Surgery (Medicine: Surgery)
CiteScore: 4.50
Self-citation rate: 7.70%
Articles published: 208
About the journal: Updates in Surgery (UPIS) was founded in 2010 as the official journal of the Italian Society of Surgery. It is an international, English-language, peer-reviewed journal dedicated to the surgical sciences. Its main goal is to offer a valuable update on the most recent developments in rapidly evolving surgical techniques, which call for rigorous debate among surgeons and continuous refinement of standards of care. In this respect, position papers on the most debated surgical approaches and accreditation criteria have been published and remain welcome. Besides its focus on general surgery, the journal draws particular attention to cutting-edge topics and emerging surgical fields, which are published in monothematic issues guest-edited by well-known experts. Updates in Surgery considers various types of papers: editorials, comprehensive reviews, original studies, and technical notes related to specific surgical procedures and techniques in liver, colorectal, gastric, pancreatic, robotic, and bariatric surgery.