Performance of artificial intelligence on a simulated Canadian urology board exam

N. Touma, Jessica E. Caterini, Kiera Liblk
{"title":"Performance of artificial intelligence on a simulated Canadian urology board exam","authors":"N. Touma, Jessica E. Caterini, Kiera Liblk","doi":"10.5489/cuaj.8800","DOIUrl":null,"url":null,"abstract":"Introduction: Generative artificial intelligence (AI) has proven to be a powerful tool with increasing applications in clinical care and medical education. CHATGPT has performed adequately on many specialty certification and knowledge assessment exams. The objective of this study was to assess the performance of CHATGPT 4 on a multiple-choice exam meant to simulate the Canadian urology board exam.\nMethods: Graduating urology residents representing all Canadian training programs gather yearly for a mock exam that simulates their upcoming board-certifying exam. The exam consists of written multiple-choice questions (MCQs) and an oral objective structured clinical examination (OSCE). The 2022 exam was taken by 29 graduating residents and was administered to CHATGPT 4.\nResults: CHATGPT 4 scored 46% on the MCQ exam, whereas the mean and median scores of graduating urology residents were 62.6%, and 62.7%, respectively. This would place CHATGPT's score 1.8 standard deviations from the median. The percentile rank of CHATGPT would be in the sixth percentile. CHATGPT scores on different topics of the exam were as follows: oncology 35%, andrology/benign prostatic hyperplasia 62%, physiology/anatomy 67%, incontinence/female urology 23%, infections 71%, urolithiasis 57%, and trauma/reconstruction 17%, with ChatGPT 4’s oncology performance being significantly below that of postgraduate year 5 residents.\nConclusions: CHATGPT 4 underperforms on an MCQ exam meant to simulate the Canadian board exam. Ongoing assessments of the capability of generative AI is needed as these models evolve and are trained on additional urology content.","PeriodicalId":38001,"journal":{"name":"Canadian Urological Association Journal","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Canadian Urological Association Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5489/cuaj.8800","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Medicine","Score":null,"Total":0}

Abstract

Introduction: Generative artificial intelligence (AI) has proven to be a powerful tool with increasing applications in clinical care and medical education. ChatGPT has performed adequately on many specialty certification and knowledge assessment exams. The objective of this study was to assess the performance of ChatGPT-4 on a multiple-choice exam meant to simulate the Canadian urology board exam.

Methods: Graduating urology residents representing all Canadian training programs gather yearly for a mock exam that simulates their upcoming board-certifying exam. The exam consists of written multiple-choice questions (MCQs) and an oral objective structured clinical examination (OSCE). The 2022 exam, taken by 29 graduating residents, was administered to ChatGPT-4.

Results: ChatGPT-4 scored 46% on the MCQ exam, whereas the mean and median scores of graduating urology residents were 62.6% and 62.7%, respectively. This places ChatGPT-4's score 1.8 standard deviations below the median, corresponding to the sixth percentile. ChatGPT-4's scores by exam topic were as follows: oncology 35%, andrology/benign prostatic hyperplasia 62%, physiology/anatomy 67%, incontinence/female urology 23%, infections 71%, urolithiasis 57%, and trauma/reconstruction 17%; its oncology performance was significantly below that of postgraduate year 5 residents.

Conclusions: ChatGPT-4 underperforms on an MCQ exam meant to simulate the Canadian board exam. Ongoing assessment of the capabilities of generative AI is needed as these models evolve and are trained on additional urology content.
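For readers who want to see how the reported summary statistics fit together, the sketch below (Python, illustrative only, not code from the study) backs out the standard deviation of resident scores implied by the reported 1.8-standard-deviation gap, and shows how a percentile rank would be computed. The 29 individual resident scores are not published, so that part is shown as a formula rather than reproduced.

```python
# Illustrative sketch only -- not code from the study. It shows how the
# abstract's "1.8 standard deviations from the median" and percentile-rank
# figures relate to the reported summary statistics.

chatgpt_score = 46.0      # ChatGPT-4 MCQ score (%)
resident_median = 62.7    # median graduating-resident score (%)
reported_sd_gap = 1.8     # reported distance in standard deviations

# Back out the standard deviation of resident scores implied by the abstract.
implied_sd = (resident_median - chatgpt_score) / reported_sd_gap
print(f"Implied SD of resident scores: ~{implied_sd:.1f} percentage points")

def percentile_rank(candidate: float, cohort: list[float]) -> float:
    """Percent of cohort scores strictly below the candidate's score."""
    below = sum(1 for s in cohort if s < candidate)
    return 100.0 * below / len(cohort)

# The 29 individual resident scores are not published, so the sixth-percentile
# figure cannot be reproduced here; with that list it would be computed as
#   percentile_rank(chatgpt_score, resident_scores)
```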
Source journal
CiteScore: 2.10
Self-citation rate: 0.00%
Articles published: 167
Journal description: Published by the Canadian Urological Association, the Canadian Urological Association Journal (CUAJ) released its first issue in March 2007, and was published four times that year under the guidance of founding editor (Editor Emeritus as of 2012), Dr. Laurence H. Klotz. In 2008, CUAJ became a bimonthly publication. As of 2013, articles have been published monthly, alternating between print and online-only versions (print issues are available in February, April, June, August, October, and December; online-only issues are produced in January, March, May, July, September, and November). In 2017, the journal launched an ahead-of-print publishing strategy, in which accepted manuscripts are published electronically on our website and cited on PubMed ahead of their official issue-based publication date. By significantly shortening the time to article availability, we offer our readers more flexibility in the way they engage with our content: as a continuous stream, or in a monthly "package," or both. CUAJ covers a broad range of urological topics, including oncology, pediatrics, transplantation, endourology, female urology, and infertility. We take pride in showcasing the work of some of Canada's top investigators, in providing our readers with the latest relevant evidence-based research, and in being the primary repository for major guidelines and other important practice recommendations. Our long-term vision is to become an essential destination for urology-based research, education, and advocacy for both physicians and patients, and to act as a springboard for discussions within the urologic community.
Latest articles from this journal
- Safety and efficacy of ultrasound-assisted bedside ureteric stent placement
- 2024 Canadian Urological Association endorsement of an expert report: Kidney involvement in tuberous sclerosis complex
- Sacral neuromodulation in pediatric refractory bladder and bowel dysfunction
- Fostering the continued growth of our association
- On vacation