A cross-sectional study to evaluate responses generated by two AI software programs for common patient queries about laparoscopic repair of inguinal hernia.
Authors: Meeran Banday, Kirat Kaur
Journal: Updates in Surgery (Impact Factor 2.4; JCR Q2, Surgery)
Publication date: 2025-03-05
Publication type: Journal Article
DOI: 10.1007/s13304-025-02158-5 (https://doi.org/10.1007/s13304-025-02158-5)
Citation count: 0
Abstract
This study aimed to evaluate the quality and accuracy of responses provided by two user-interactive AI chatbots, ChatGPT and ChatSonic, to patient queries about laparoscopic repair of inguinal hernias, and to determine the suitability of these chatbots for addressing such queries. Ten questions about laparoscopic repair of inguinal hernias were developed and presented to ChatGPT 4.0 and ChatSonic. Two experienced surgeons, blinded to the source, evaluated the responses using the Global Quality Score (GQS) and the modified DISCERN score to gauge response quality and reliability. ChatGPT gave high-quality responses (GQS of 4 or 5) for all ten questions according to one evaluator, and for seven of the ten according to the other. Similarly, ChatGPT showed high reliability (modified DISCERN of 4 or 5) for nine responses according to one evaluator, but for only three according to the other. Agreement between evaluators was only slight for both GQS (kappa = 0.20) and modified DISCERN scores (kappa = 0.08). ChatSonic also provided high-quality and reliable responses for a majority of questions, albeit to a lesser extent than ChatGPT, and both chatbots demonstrated limited concordance in responses (p > 0.05). Overall, both ChatGPT and ChatSonic demonstrated potential utility in responding to patient queries about hernia surgery. However, because of inconsistencies in reliability and quality, ongoing refinement and validation of AI-generated medical information remain necessary before widespread clinical adoption.
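The kappa values reported above summarize how often the two evaluators agreed beyond what chance alone would produce. The abstract does not say which kappa variant was used or whether scores were dichotomized first, so the following is only a minimal sketch of unweighted Cohen's kappa. The function name `cohen_kappa` and the score lists are hypothetical illustrations, not the study's data or method.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given identical scores.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each score category, multiply the two raters'
    # marginal proportions, then sum over categories.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical 1-5 scores for ten responses (assumed values, NOT study data).
evaluator_1 = [5, 4, 5, 4, 5, 5, 4, 5, 4, 5]
evaluator_2 = [4, 4, 3, 4, 5, 3, 4, 5, 3, 4]

kappa = cohen_kappa(evaluator_1, evaluator_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the evaluators give identical scores on five of ten items, yet kappa is well below 0.5 because part of that agreement is expected by chance. On the commonly used Landis and Koch scale, values from 0.00 to 0.20 are labelled "slight" agreement, which is consistent with the wording of the abstract.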
About the journal:
Updates in Surgery (UPIS) was founded in 2010 as the official journal of the Italian Society of Surgery. It is an international, English-language, peer-reviewed journal dedicated to the surgical sciences. Its main goal is to offer a valuable update on the most recent developments in rapidly evolving surgical techniques, which oblige the surgical community to engage in rigorous debate and continuous refinement of standards of care. To this end, position papers on the most debated surgical approaches and accreditation criteria have been published and remain welcome.
Besides its focus on general surgery, the journal draws particular attention to cutting-edge topics and emerging surgical fields, which are published in monothematic issues guest-edited by well-known experts.
Updates in Surgery considers various types of papers: editorials, comprehensive reviews, original studies, and technical notes related to specific surgical procedures and techniques in liver, colorectal, gastric, pancreatic, robotic, and bariatric surgery.