Use of ChatGPT to Generate Informed Consent for Surgery in Urogynecology.

Urogynecology (Hagerstown, Md.) · IF 0.8 · Q4 (Obstetrics & Gynecology) · Pub Date: 2025-01-17 · DOI: 10.1097/SPV.0000000000001638
Emily S Johnson, Eva K Welch, Jacqueline Kikuchi, Heather Barbier, Christine M Vaccaro, Felicia Balzano, Katherine L Dengler
{"title":"Use of ChatGPT to Generate Informed Consent for Surgery in Urogynecology.","authors":"Emily S Johnson, Eva K Welch, Jacqueline Kikuchi, Heather Barbier, Christine M Vaccaro, Felicia Balzano, Katherine L Dengler","doi":"10.1097/SPV.0000000000001638","DOIUrl":null,"url":null,"abstract":"<p><strong>Importance: </strong>Use of the publicly available Large Language Model, Chat Generative Pre-trained Transformer (ChatGPT 3.5; OpenAI, 2022), is growing in health care despite varying accuracies.</p><p><strong>Objective: </strong>The aim of this study was to assess the accuracy and readability of ChatGPT's responses to questions encompassing surgical informed consent in urogynecology.</p><p><strong>Study design: </strong>Five fellowship-trained urogynecology attending physicians and 1 reconstructive female urologist evaluated ChatGPT's responses to questions about 4 surgical procedures: (1) retropubic midurethral sling, (2) total vaginal hysterectomy, (3) uterosacral ligament suspension, and (4) sacrocolpopexy. Questions involved procedure descriptions, risks/benefits/alternatives, and additional resources. Responses were rated using the DISCERN tool, a 4-point accuracy scale, and the Flesch-Kinkaid Grade Level score.</p><p><strong>Results: </strong>The median DISCERN tool overall rating was 3 (interquartile range [IQR], 3-4), indicating a moderate rating (\"potentially important but not serious shortcomings\"). Retropubic midurethral sling received the highest overall score (median, 4; IQR, 3-4), and uterosacral ligament suspension received the lowest (median, 3; IQR, 3-3). Using the 4-point accuracy scale, 44.0% of responses received a score of 4 (\"correct and adequate\"), 22.6% received a score of 3 (\"correct but insufficient\"), 29.8% received a score of 2 (\"accurate and misleading information together\"), and 3.6% received a score of 1 (\"wrong or irrelevant answer\"). ChatGPT performance was poor for discussion of benefits and alternatives for all surgical procedures, with some responses being inaccurate. The mean Flesch-Kinkaid Grade Level score for all responses was 17.5 (SD, 2.1), corresponding to a postgraduate reading level.</p><p><strong>Conclusions: </strong>Overall, ChatGPT generated accurate responses to questions about surgical informed consent. However, it produced clearly false portions of responses, highlighting the need for a careful review of responses by qualified health care professionals.</p>","PeriodicalId":75288,"journal":{"name":"Urogynecology (Hagerstown, Md.)","volume":" ","pages":""},"PeriodicalIF":0.8000,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Urogynecology (Hagerstown, Md.)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1097/SPV.0000000000001638","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"OBSTETRICS & GYNECOLOGY","Score":null,"Total":0}

Abstract

Importance: Use of the publicly available Large Language Model, Chat Generative Pre-trained Transformer (ChatGPT 3.5; OpenAI, 2022), is growing in health care despite varying accuracies.

Objective: The aim of this study was to assess the accuracy and readability of ChatGPT's responses to questions encompassing surgical informed consent in urogynecology.

Study design: Five fellowship-trained urogynecology attending physicians and 1 reconstructive female urologist evaluated ChatGPT's responses to questions about 4 surgical procedures: (1) retropubic midurethral sling, (2) total vaginal hysterectomy, (3) uterosacral ligament suspension, and (4) sacrocolpopexy. Questions involved procedure descriptions, risks/benefits/alternatives, and additional resources. Responses were rated using the DISCERN tool, a 4-point accuracy scale, and the Flesch-Kincaid Grade Level score.
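For context, the Flesch-Kincaid Grade Level used in this study is a standard readability formula based on average sentence length and average syllables per word (grade ≈ 0.39 × words per sentence + 11.8 × syllables per word − 15.59). The short Python sketch below is illustrative only and is not drawn from the study's methods; it uses a crude vowel-group heuristic for syllable counting, so its scores are approximate, and the sample sentence is hypothetical.

import re

def count_syllables(word):
    # Crude heuristic: count groups of consecutive vowels (approximation only).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    # Flesch-Kincaid Grade Level = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

# Hypothetical consent-style sentence, scored for illustration.
sample = ("A retropubic midurethral sling places a synthetic mesh tape beneath the "
          "urethra to treat stress urinary incontinence in appropriately selected patients.")
print(round(flesch_kincaid_grade(sample), 1))

A grade above 12 on this scale corresponds to college-level text and above 16 to postgraduate-level text, which is how the mean score reported in the Results maps to a postgraduate reading level.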

Results: The median DISCERN tool overall rating was 3 (interquartile range [IQR], 3-4), indicating a moderate rating ("potentially important but not serious shortcomings"). Retropubic midurethral sling received the highest overall score (median, 4; IQR, 3-4), and uterosacral ligament suspension received the lowest (median, 3; IQR, 3-3). Using the 4-point accuracy scale, 44.0% of responses received a score of 4 ("correct and adequate"), 22.6% received a score of 3 ("correct but insufficient"), 29.8% received a score of 2 ("accurate and misleading information together"), and 3.6% received a score of 1 ("wrong or irrelevant answer"). ChatGPT performance was poor for discussion of benefits and alternatives for all surgical procedures, with some responses being inaccurate. The mean Flesch-Kincaid Grade Level score for all responses was 17.5 (SD, 2.1), corresponding to a postgraduate reading level.

Conclusions: Overall, ChatGPT generated accurate responses to questions about surgical informed consent. However, it produced clearly false portions of responses, highlighting the need for a careful review of responses by qualified health care professionals.
