The potential of ChatGPT in medicine: an example analysis of nephrology specialty exams in Poland

Clinical Kidney Journal (IF 3.9; JCR Q1, Urology & Nephrology; CAS Zone 2, Medicine) · Pub Date: 2024-06-22 · DOI: 10.1093/ckj/sfae193

Jan Nicikowski, Mikołaj Szczepański, Miłosz Miedziaszczyk, Bartosz Kudliński


Abstract

Background and hypothesis: In November 2022, OpenAI released ChatGPT, a chatbot that processes natural language to produce human-like conversational dialogue. It has attracted considerable interest, including from the scientific and medical communities. Recent publications have shown that ChatGPT can correctly answer questions from medical exams such as the United States Medical Licensing Examination (USMLE) and other specialty exams. To date, no study anywhere in the world has tested ChatGPT on specialty questions in nephrology.

Methods: In this comparative cross-sectional study, we used ChatGPT-3.5 and ChatGPT-4.0 to analyze 1560 single-answer questions from the national specialty exam in nephrology from 2017 to 2023 that were available, along with answer keys, in the Polish Medical Examination Center's question database.

Results: Of the 1556 questions posed to ChatGPT-4.0, correct answers were obtained with an accuracy of 69.84%, compared to ChatGPT-3.5 (45.70%, P = .0001) and to the top results of medical doctors (85.73%, P = .0001). ChatGPT-4.0 met the required pass mark of ≥60% in 11 of the 13 tests and scored higher than the average of the human exam results.

Conclusion: ChatGPT-3.5 was not spectacularly successful in nephrology exams. ChatGPT-4.0 was able to pass most of the analyzed nephrology specialty exams. New generations of ChatGPT achieve results similar to those of humans, but the best human results are better than those of ChatGPT-4.0.
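The kind of arithmetic behind the Results can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the abstract does not say which statistical test produced the reported P values, so the pooled two-proportion z-test below is an assumption, and the function names and the per-exam counts in the usage lines are hypothetical. Only the 60% pass mark comes from the abstract.

```python
from math import erfc, sqrt

PASS_THRESHOLD = 0.60  # pass mark stated in the abstract (>= 60% correct)


def accuracy(correct: int, total: int) -> float:
    """Fraction of questions answered correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def passes(correct: int, total: int, threshold: float = PASS_THRESHOLD) -> bool:
    """True if the score meets or exceeds the pass mark."""
    return accuracy(correct, total) >= threshold


def two_proportion_z_test(c1: int, n1: int, c2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumption: chosen for illustration only, since the abstract
    does not name the test the authors used.
    """
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Hypothetical single exam of 120 questions (not data from the study).
model_a_correct, model_b_correct, n_questions = 84, 55, 120
print(passes(model_a_correct, n_questions))  # 84/120 = 70%, above the pass mark
print(passes(model_b_correct, n_questions))  # 55/120 is about 45.8%, below it
print(two_proportion_z_test(model_a_correct, n_questions,
                            model_b_correct, n_questions))
```

Applying `passes` to each exam session separately mirrors the per-test pass/fail count reported in the abstract, while the z-test compares two overall accuracies; with the study's real per-exam counts, a paired test on the same questions might be more appropriate.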


Source journal: Clinical Kidney Journal (Medicine-Transplantation)

CiteScore: 6.70

Self-citation rate: 10.90%

Articles published: 242

Review time: 8 weeks

About the journal: Clinical Kidney Journal: Clinical and Translational Nephrology (ckj), an official journal of the ERA-EDTA (European Renal Association-European Dialysis and Transplant Association), is a fully open access, online-only journal published bimonthly. The journal is an essential educational and training resource that integrates clinical, translational and educational research into clinical practice. ckj aims to contribute to a translational research culture among nephrologists and kidney pathologists that helps close the gap between basic researchers and practicing clinicians and promotes sorely needed innovation in the field of nephrology. All research articles in this journal have undergone peer review.