Establishing a novel score system and using it to assess and compare the quality of ChatGPT-4 consultation with physician consultation for obstetrics and gynecology: A pilot study.

IF 2.6 · CAS Tier 3 (Medicine) · Q2 (Obstetrics & Gynecology) · International Journal of Gynecology & Obstetrics · Pub Date: 2024-09-28 · DOI: 10.1002/ijgo.15934
Lan Lan, Ling Yang, Jinyan Li, Jia Hou, Yunsheng Yan, Yaozong Zhang
{"title":"建立一个新的评分系统,并用它来评估和比较 ChatGPT-4 咨询与妇产科医生咨询的质量:试点研究。","authors":"Lan Lan, Ling Yang, Jinyan Li, Jia Hou, Yunsheng Yan, Yaozong Zhang","doi":"10.1002/ijgo.15934","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>In the current study, we aimed to establish a quantified scoring system for evaluating consultation quality. Subsequently, using the score system to assess the quality of ChatGPT-4 consultations, we compared them with physician consultations when presented with the same clinical cases from obstetrics and gynecology.</p><p><strong>Methods: </strong>This study was conducted in the Women and Children's Hospital of Chongqing Medical University, a tertiary-care hospital with approximately 16 000-20 000 deliveries and 8500-12 000 gynecologic surgeries per year. The detailed data from obstetric and gynecologic medical records were analyzed by ChatGPT-4 and physicians; the consultation opinions were then generated respectively. All consultation opinions were graded by eight junior doctors using the novel score system; subsequently, the correlation, agreement, and comparison between the two types of consultation opinions were then evaluated.</p><p><strong>Results: </strong>A total of 100 medical records from obstetrics and 100 medical records from gynecology were randomly selected. Pearson correlation analysis suggested a noncorrelation or weak correlation between consultations from ChatGPT-4 and physicians. Bland-Altman plot showed an unacceptable agreement between the two types of consultation opinions. Paired t tests showed that the scores of physician consultations were significantly higher than those generated by ChatGPT-4 in both obstetric and gynecologic patients.</p><p><strong>Conclusion: </strong>At present, ChatGPT-4 may not be a substitute for physicians in consultations for obstetric and gynecologic patients. Therefore, it is crucial to pay careful attention and conduct ongoing evaluations to ensure the quality of consultation opinions generated by ChatGPT-4.</p>","PeriodicalId":14164,"journal":{"name":"International Journal of Gynecology & Obstetrics","volume":null,"pages":null},"PeriodicalIF":2.6000,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Establishing a novel score system and using it to assess and compare the quality of ChatGPT-4 consultation with physician consultation for obstetrics and gynecology: A pilot study.\",\"authors\":\"Lan Lan, Ling Yang, Jinyan Li, Jia Hou, Yunsheng Yan, Yaozong Zhang\",\"doi\":\"10.1002/ijgo.15934\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objectives: </strong>In the current study, we aimed to establish a quantified scoring system for evaluating consultation quality. Subsequently, using the score system to assess the quality of ChatGPT-4 consultations, we compared them with physician consultations when presented with the same clinical cases from obstetrics and gynecology.</p><p><strong>Methods: </strong>This study was conducted in the Women and Children's Hospital of Chongqing Medical University, a tertiary-care hospital with approximately 16 000-20 000 deliveries and 8500-12 000 gynecologic surgeries per year. The detailed data from obstetric and gynecologic medical records were analyzed by ChatGPT-4 and physicians; the consultation opinions were then generated respectively. 
All consultation opinions were graded by eight junior doctors using the novel score system; subsequently, the correlation, agreement, and comparison between the two types of consultation opinions were then evaluated.</p><p><strong>Results: </strong>A total of 100 medical records from obstetrics and 100 medical records from gynecology were randomly selected. Pearson correlation analysis suggested a noncorrelation or weak correlation between consultations from ChatGPT-4 and physicians. Bland-Altman plot showed an unacceptable agreement between the two types of consultation opinions. Paired t tests showed that the scores of physician consultations were significantly higher than those generated by ChatGPT-4 in both obstetric and gynecologic patients.</p><p><strong>Conclusion: </strong>At present, ChatGPT-4 may not be a substitute for physicians in consultations for obstetric and gynecologic patients. Therefore, it is crucial to pay careful attention and conduct ongoing evaluations to ensure the quality of consultation opinions generated by ChatGPT-4.</p>\",\"PeriodicalId\":14164,\"journal\":{\"name\":\"International Journal of Gynecology & Obstetrics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2024-09-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Gynecology & Obstetrics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1002/ijgo.15934\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"OBSTETRICS & GYNECOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Gynecology & Obstetrics","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/ijgo.15934","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OBSTETRICS & GYNECOLOGY","Score":null,"Total":0}
Citations: 0

Abstract


Objectives: In the current study, we aimed to establish a quantified scoring system for evaluating consultation quality. Using this scoring system, we then assessed the quality of ChatGPT-4 consultations and compared them with physician consultations for the same clinical cases from obstetrics and gynecology.

Methods: This study was conducted in the Women and Children's Hospital of Chongqing Medical University, a tertiary-care hospital with approximately 16 000-20 000 deliveries and 8500-12 000 gynecologic surgeries per year. Detailed data from obstetric and gynecologic medical records were analyzed by ChatGPT-4 and by physicians, and consultation opinions were generated by each. All consultation opinions were graded by eight junior doctors using the novel scoring system; the correlation and agreement between the two types of consultation opinions were then evaluated, and their scores were compared.

Results: A total of 100 medical records from obstetrics and 100 medical records from gynecology were randomly selected. Pearson correlation analysis suggested no correlation or only a weak correlation between the scores of ChatGPT-4 and physician consultations. Bland-Altman plots showed unacceptable agreement between the two types of consultation opinions. Paired t tests showed that the scores of physician consultations were significantly higher than those of ChatGPT-4 consultations for both obstetric and gynecologic patients.
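As a rough illustration of the statistical comparison described in the Results, the sketch below computes a Pearson correlation, Bland-Altman bias and limits of agreement, and a paired t test on two sets of per-case quality scores. The score arrays are synthetic placeholders, not the study's data; the actual grading scale and per-case scores are not reproduced here.

```python
# Minimal sketch of the three analyses named in the Results, using
# placeholder scores for 100 paired cases (hypothetical values).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gpt_scores = rng.normal(loc=70, scale=10, size=100)        # ChatGPT-4 consultation scores
physician_scores = rng.normal(loc=80, scale=8, size=100)   # physician consultation scores

# Pearson correlation between the two sets of scores.
r, r_pvalue = stats.pearsonr(gpt_scores, physician_scores)

# Bland-Altman agreement: mean difference (bias) and 95% limits of agreement.
diff = physician_scores - gpt_scores
bias = diff.mean()
half_width = 1.96 * diff.std(ddof=1)
lower, upper = bias - half_width, bias + half_width

# Paired t test on the per-case score differences.
t_stat, t_pvalue = stats.ttest_rel(physician_scores, gpt_scores)

print(f"Pearson r = {r:.2f} (p = {r_pvalue:.3f})")
print(f"Bland-Altman bias = {bias:.1f}, limits of agreement = [{lower:.1f}, {upper:.1f}]")
print(f"Paired t test: t = {t_stat:.2f}, p = {t_pvalue:.3f}")
```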

Conclusion: At present, ChatGPT-4 may not be a substitute for physicians in consultations for obstetric and gynecologic patients. Careful oversight and ongoing evaluation are therefore crucial to ensure the quality of consultation opinions generated by ChatGPT-4.

Source journal metrics: CiteScore 5.80; self-citation rate 2.60%; 493 articles published per year; review time 3-6 weeks.
About the journal: The International Journal of Gynecology & Obstetrics publishes articles on all aspects of basic and clinical research in the fields of obstetrics and gynecology and related subjects, with emphasis on matters of worldwide interest.