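Use of generative large language models for patient education on common surgical conditions: a comparative analysis between ChatGPT and Google Gemini.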

IF 2.4; Zone 3 (Medicine); JCR Q2 (Surgery); Updates in Surgery; Pub Date: 2025-01-15; DOI: 10.1007/s13304-025-02074-8
Omar Mahmoud ELSenbawy, Keval Bhavesh Patel, Randev Ayodhya Wannakuwatte, Akhila N Thota
Citations: 0

Abstract

It is increasingly important for patients to have easy access to information about their medical conditions, so that they can better understand and participate in health care decisions. Artificial Intelligence (AI) has proven to be a fast, efficient, and effective tool for educating patients about their health conditions. This study compares the responses of two AI tools, ChatGPT and Google Gemini, assessing the conciseness and understandability of the information they provide on deep vein thrombosis, decubitus ulcers, and hemorrhoids. A cross-sectional study was conducted on the responses generated by ChatGPT and Google Gemini for the post-surgical complications deep vein thrombosis, decubitus ulcers, and hemorrhoids. Each response was evaluated with a Flesch-Kincaid calculator for total number of words, number of sentences, average words per sentence, average syllables per word, grade level, and ease score. Additionally, similarity was evaluated using QuillBot, and reliability using a modified DISCERN score. The results were then analyzed with an unpaired (two-sample) t-test to compare the means of the two AI tools and determine which was superior. ChatGPT's responses required a higher education level to understand, as indicated by their higher grade levels and lower ease scores. The deep vein thrombosis brochure had the lowest ease score and the highest grade level, making it the hardest of the three to read. ChatGPT's responses showed more similarity to information available on the internet, as measured by QuillBot's plagiarism checker. Reliability, measured with the modified DISCERN score, was similar for both AI tools. Although the scores differ between the two AI tools, the P values obtained do not provide enough evidence to conclude that one tool is superior to the other.
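
The readability metrics and the unpaired t-test named in the abstract can be sketched as follows. This is a minimal illustration, not the study's actual calculator or analysis. The function names are hypothetical, the coefficients are the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas, and the t statistic assumes the pooled-variance (Student) form, which the abstract does not specify.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def flesch_reading_ease(words_per_sentence: float, syllables_per_word: float) -> float:
    """Flesch Reading Ease: a higher score means easier text."""
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_kincaid_grade(words_per_sentence: float, syllables_per_word: float) -> float:
    """Flesch-Kincaid Grade Level: a higher score means harder text."""
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


def unpaired_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Student's two-sample t statistic (pooled variance) and its degrees of freedom.

    Assumption: equal variances in both groups; the abstract does not say
    whether the pooled or the Welch form of the test was used.
    """
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


if __name__ == "__main__":
    # Hypothetical per-response averages, not data from the study.
    print(flesch_reading_ease(15.0, 1.5))
    print(flesch_kincaid_grade(15.0, 1.5))
    print(unpaired_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

The two scales run in opposite directions: a higher ease score means easier text, while a higher grade level means harder text. Converting the t statistic into a P value additionally requires the t distribution with the returned degrees of freedom, for example from a statistics library.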

Source journal
Updates in Surgery (Medicine-Surgery)
CiteScore: 4.50
Self-citation rate: 7.70%
Articles published: 208
About the journal: Updates in Surgery (UPIS) was founded in 2010 as the official journal of the Italian Society of Surgery. It is an international, English-language, peer-reviewed journal dedicated to the surgical sciences. Its main goal is to offer a valuable update on the most recent developments in rapidly evolving surgical techniques, which push the community of surgeons toward rigorous debate and continuous refinement of standards of care. To this end, position papers on the most debated surgical approaches and accreditation criteria have been published and remain welcome. Besides its focus on general surgery, the journal draws particular attention to cutting-edge topics and emerging surgical fields, which are published in monothematic issues guest-edited by well-known experts. Updates in Surgery considers various types of papers: editorials, comprehensive reviews, original studies, and technical notes related to specific surgical procedures and techniques in liver, colorectal, gastric, pancreatic, robotic, and bariatric surgery.
Latest articles from this journal
Analysis of histological features and recurrence risk assessment of papillary thyroid carcinoma according to presurgery FNAC category.
A prospective observational study of laparoscopic approaches for suspected gallbladder cancer in Yamaguchi (YPB-002 LAGBY).
Outcome prediction after emergency cholecystectomy: performance evaluation of the ACS-NSQIP surgical risk calculator and the 5-item modified frailty index.
Current perspectives on living donor selection in liver transplantation.
Textbook outcome following pancreaticoduodenectomy in elderly patients: age-stratified analysis and predictive factors.