{"title":"人工智能在模拟加拿大泌尿外科委员会考试中的表现","authors":"N. Touma, Jessica E. Caterini, Kiera Liblk","doi":"10.5489/cuaj.8800","DOIUrl":null,"url":null,"abstract":"Introduction: Generative artificial intelligence (AI) has proven to be a powerful tool with increasing applications in clinical care and medical education. CHATGPT has performed adequately on many specialty certification and knowledge assessment exams. The objective of this study was to assess the performance of CHATGPT 4 on a multiple-choice exam meant to simulate the Canadian urology board exam.\nMethods: Graduating urology residents representing all Canadian training programs gather yearly for a mock exam that simulates their upcoming board-certifying exam. The exam consists of written multiple-choice questions (MCQs) and an oral objective structured clinical examination (OSCE). The 2022 exam was taken by 29 graduating residents and was administered to CHATGPT 4.\nResults: CHATGPT 4 scored 46% on the MCQ exam, whereas the mean and median scores of graduating urology residents were 62.6%, and 62.7%, respectively. This would place CHATGPT's score 1.8 standard deviations from the median. The percentile rank of CHATGPT would be in the sixth percentile. CHATGPT scores on different topics of the exam were as follows: oncology 35%, andrology/benign prostatic hyperplasia 62%, physiology/anatomy 67%, incontinence/female urology 23%, infections 71%, urolithiasis 57%, and trauma/reconstruction 17%, with ChatGPT 4’s oncology performance being significantly below that of postgraduate year 5 residents.\nConclusions: CHATGPT 4 underperforms on an MCQ exam meant to simulate the Canadian board exam. Ongoing assessments of the capability of generative AI is needed as these models evolve and are trained on additional urology content.","PeriodicalId":38001,"journal":{"name":"Canadian Urological Association Journal","volume":" 550","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Performance of artificial intelligence on a simulated Canadian urology board exam\",\"authors\":\"N. Touma, Jessica E. Caterini, Kiera Liblk\",\"doi\":\"10.5489/cuaj.8800\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Introduction: Generative artificial intelligence (AI) has proven to be a powerful tool with increasing applications in clinical care and medical education. CHATGPT has performed adequately on many specialty certification and knowledge assessment exams. The objective of this study was to assess the performance of CHATGPT 4 on a multiple-choice exam meant to simulate the Canadian urology board exam.\\nMethods: Graduating urology residents representing all Canadian training programs gather yearly for a mock exam that simulates their upcoming board-certifying exam. The exam consists of written multiple-choice questions (MCQs) and an oral objective structured clinical examination (OSCE). The 2022 exam was taken by 29 graduating residents and was administered to CHATGPT 4.\\nResults: CHATGPT 4 scored 46% on the MCQ exam, whereas the mean and median scores of graduating urology residents were 62.6%, and 62.7%, respectively. This would place CHATGPT's score 1.8 standard deviations from the median. The percentile rank of CHATGPT would be in the sixth percentile. 
CHATGPT scores on different topics of the exam were as follows: oncology 35%, andrology/benign prostatic hyperplasia 62%, physiology/anatomy 67%, incontinence/female urology 23%, infections 71%, urolithiasis 57%, and trauma/reconstruction 17%, with ChatGPT 4’s oncology performance being significantly below that of postgraduate year 5 residents.\\nConclusions: CHATGPT 4 underperforms on an MCQ exam meant to simulate the Canadian board exam. Ongoing assessments of the capability of generative AI is needed as these models evolve and are trained on additional urology content.\",\"PeriodicalId\":38001,\"journal\":{\"name\":\"Canadian Urological Association Journal\",\"volume\":\" 550\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Canadian Urological Association Journal\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5489/cuaj.8800\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Canadian Urological Association Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5489/cuaj.8800","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Medicine","Score":null,"Total":0}
Performance of artificial intelligence on a simulated Canadian urology board exam
Introduction: Generative artificial intelligence (AI) has proven to be a powerful tool with increasing applications in clinical care and medical education. ChatGPT has performed adequately on many specialty certification and knowledge assessment exams. The objective of this study was to assess the performance of ChatGPT 4 on a multiple-choice exam meant to simulate the Canadian urology board exam.
Methods: Graduating urology residents representing all Canadian training programs gather yearly for a mock exam that simulates their upcoming board-certifying exam. The exam consists of written multiple-choice questions (MCQs) and an oral objective structured clinical examination (OSCE). The 2022 exam was taken by 29 graduating residents and was also administered to ChatGPT 4.
Results: ChatGPT 4 scored 46% on the MCQ exam, whereas the mean and median scores of graduating urology residents were 62.6% and 62.7%, respectively. This places ChatGPT 4's score 1.8 standard deviations below the median, ranking in the sixth percentile. ChatGPT 4's scores on the exam topics were as follows: oncology 35%, andrology/benign prostatic hyperplasia 62%, physiology/anatomy 67%, incontinence/female urology 23%, infections 71%, urolithiasis 57%, and trauma/reconstruction 17%. Its oncology performance was significantly below that of postgraduate year 5 residents.
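The cohort standard deviation is not stated in the abstract, but it is implied by the reported figures; as a quick consistency check (an inference from the reported numbers, not a value from the study), the z-score relation gives:

\[
  z = \frac{x_{\mathrm{ChatGPT}} - \tilde{x}}{\sigma} = -1.8
  \quad\Longrightarrow\quad
  \sigma = \frac{\tilde{x} - x_{\mathrm{ChatGPT}}}{1.8} = \frac{62.7\% - 46\%}{1.8} \approx 9.3\%
\]

Under a normal approximation, a z-score of -1.8 corresponds to roughly the 4th percentile, broadly consistent with the reported sixth-percentile rank relative to the 29-resident cohort.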
Conclusions: ChatGPT 4 underperforms on an MCQ exam meant to simulate the Canadian urology board exam. Ongoing assessments of the capabilities of generative AI are needed as these models evolve and are trained on additional urology content.
Journal introduction:
Published by the Canadian Urological Association, the Canadian Urological Association Journal (CUAJ) released its first issue in March 2007, and was published four times that year under the guidance of founding editor (Editor Emeritus as of 2012), Dr. Laurence H. Klotz. In 2008, CUAJ became a bimonthly publication. As of 2013, articles have been published monthly, alternating between print and online-only versions (print issues are available in February, April, June, August, October, and December; online-only issues are produced in January, March, May, July, September, and November). In 2017, the journal launched an ahead-of-print publishing strategy, in which accepted manuscripts are published electronically on our website and cited on PubMed ahead of their official issue-based publication date. By significantly shortening the time to article availability, we offer our readers more flexibility in the way they engage with our content: as a continuous stream, in a monthly "package," or both. CUAJ covers a broad range of urological topics: oncology, pediatrics, transplantation, endourology, female urology, infertility, and more. We take pride in showcasing the work of some of Canada's top investigators, in providing our readers with the latest relevant evidence-based research, and in being the primary repository for major guidelines and other important practice recommendations. Our long-term vision is to become an essential destination for urology-based research, education, and advocacy for both physicians and patients, and to act as a springboard for discussions within the urologic community.