A cross-sectional study to evaluate responses generated by two AI software programs for common patient queries about laparoscopic repair of inguinal hernia.

IF 2.4 3区医学 Q2 SURGERY Updates in Surgery Pub Date : 2025-03-05 DOI:10.1007/s13304-025-02158-5

Meeran Banday, Kirat Kaur

{"title":"A cross-sectional study to evaluate responses generated by two AI software programs for common patient queries about laparoscopic repair of inguinal hernia.","authors":"Meeran Banday, Kirat Kaur","doi":"10.1007/s13304-025-02158-5","DOIUrl":null,"url":null,"abstract":"<p><p>This study aimed to evaluate the quality and accuracy of responses provided by two user-interactive AI chatbots, namely ChatGPT and ChatSonic, in response to patient queries regarding laparoscopic repair of inguinal hernias, and additionally determine the suitability of these chatbots in addressing patient queries related to inguinal hernia repair. Ten questions regarding laparoscopic repair of inguinal hernias were developed and presented to ChatGPT 4.0 and ChatSonic. Responses were evaluated by two experienced surgeons blinded to the source, using the Global Quality Score (GQS) and modified DISCERN Score to gauge response quality and reliability. ChatGPT demonstrated high-quality responses (GQS = 4 & 5) for all ten questions according to one evaluator, and for seven out of ten questions according to the other. Similarly, ChatGPT showed high reliability (DISCERN = 4 & 5) for nine responses according to one evaluator, and for three responses according to the other, with only slight agreement between evaluators for both GQS (kappa = 0.20) and modified DISCERN scores (kappa = 0.08). ChatSonic also provided high-quality and reliable responses for a majority of questions, albeit to a lesser extent than ChatGPT, and both demonstrating limited concordance in responses (p > 0.05). Overall, Both ChatGPT and ChatSonic demonstrated potential utility in providing responses to patient queries about hernia surgery. However, due to inconsistencies in reliability and quality, ongoing refinement and validation of AI generated medical information remain necessary before widespread clinical adoption.</p>","PeriodicalId":23391,"journal":{"name":"Updates in Surgery","volume":" ","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Updates in Surgery","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s13304-025-02158-5","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"SURGERY","Score":null,"Total":0}

引用次数: 0

Abstract

This study aimed to evaluate the quality and accuracy of responses provided by two user-interactive AI chatbots, namely ChatGPT and ChatSonic, in response to patient queries regarding laparoscopic repair of inguinal hernias, and additionally determine the suitability of these chatbots in addressing patient queries related to inguinal hernia repair. Ten questions regarding laparoscopic repair of inguinal hernias were developed and presented to ChatGPT 4.0 and ChatSonic. Responses were evaluated by two experienced surgeons blinded to the source, using the Global Quality Score (GQS) and modified DISCERN Score to gauge response quality and reliability. ChatGPT demonstrated high-quality responses (GQS = 4 & 5) for all ten questions according to one evaluator, and for seven out of ten questions according to the other. Similarly, ChatGPT showed high reliability (DISCERN = 4 & 5) for nine responses according to one evaluator, and for three responses according to the other, with only slight agreement between evaluators for both GQS (kappa = 0.20) and modified DISCERN scores (kappa = 0.08). ChatSonic also provided high-quality and reliable responses for a majority of questions, albeit to a lesser extent than ChatGPT, and both demonstrating limited concordance in responses (p > 0.05). Overall, Both ChatGPT and ChatSonic demonstrated potential utility in providing responses to patient queries about hernia surgery. However, due to inconsistencies in reliability and quality, ongoing refinement and validation of AI generated medical information remain necessary before widespread clinical adoption.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

一项横断面研究，旨在评估两款人工智能软件程序针对腹腔镜腹股沟疝修补术常见患者询问所生成的回复。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Updates in Surgery Medicine-Surgery

CiteScore

4.50

自引率

7.70%

发文量

208

期刊介绍： Updates in Surgery (UPIS) has been founded in 2010 as the official journal of the Italian Society of Surgery. It’s an international, English-language, peer-reviewed journal dedicated to the surgical sciences. Its main goal is to offer a valuable update on the most recent developments of those surgical techniques that are rapidly evolving, forcing the community of surgeons to a rigorous debate and a continuous refinement of standards of care. In this respect position papers on the mostly debated surgical approaches and accreditation criteria have been published and are welcome for the future. Beside its focus on general surgery, the journal draws particular attention to cutting edge topics and emerging surgical fields that are publishing in monothematic issues guest edited by well-known experts. Updates in Surgery has been considering various types of papers: editorials, comprehensive reviews, original studies and technical notes related to specific surgical procedures and techniques on liver, colorectal, gastric, pancreatic, robotic and bariatric surgery.