{"title":"ChatGPT-3.5 和 ChatGPT-4o 在日本全国牙科考试中的表现。","authors":"Osamu Uehara, Tetsuro Morikawa, Fumiya Harada, Nodoka Sugiyama, Yuko Matsuki, Daichi Hiraki, Hinako Sakurai, Takashi Kado, Koki Yoshida, Yukie Murata, Hirofumi Matsuoka, Toshiyuki Nagasawa, Yasushi Furuichi, Yoshihiro Abiko, Hiroko Miura","doi":"10.1002/jdd.13766","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>In this study, we compared the performance of ChatGPT-3.5 to that of ChatGPT-4o in the context of the Japanese National Dental Examination, which assesses clinical reasoning skills and dental knowledge, to determine their potential usefulness in dental education.</p><p><strong>Methods: </strong>ChatGPT's performance was assessed using 1399 (55% of the exam) of 2520 questions from the Japanese National Dental Examinations (111-117). The 1121 excluded questions (45% of the exam) contained figures or tables that ChatGPT could not recognize. The questions were categorized into 18 different subjects based on dental specialty. Statistical analysis was performed using SPSS software, with McNemar's test applied to assess differences in performance.</p><p><strong>Results: </strong>A significant improvement was noted in the percentage of correct answers from ChatGPT-4o (84.63%) compared with those from ChatGPT-3.5 (45.46%), demonstrating enhanced reliability and subject knowledge. ChatGPT-4o consistently outperformed ChatGPT-3.5 across all dental subjects, with significant improvements in subjects such as oral surgery, pathology, pharmacology, and microbiology. Heatmap analysis revealed that ChatGPT-4o provided more stable and higher correct answer rates, especially for complex subjects.</p><p><strong>Conclusions: </strong>This study found that advanced natural language processing models, such as ChatGPT-4o, potentially have sufficiently advanced clinical reasoning skills and dental knowledge to function as a supplementary tool in dental education and exam preparation.</p>","PeriodicalId":50216,"journal":{"name":"Journal of Dental Education","volume":null,"pages":null},"PeriodicalIF":1.4000,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Performance of ChatGPT-3.5 and ChatGPT-4o in the Japanese National Dental Examination.\",\"authors\":\"Osamu Uehara, Tetsuro Morikawa, Fumiya Harada, Nodoka Sugiyama, Yuko Matsuki, Daichi Hiraki, Hinako Sakurai, Takashi Kado, Koki Yoshida, Yukie Murata, Hirofumi Matsuoka, Toshiyuki Nagasawa, Yasushi Furuichi, Yoshihiro Abiko, Hiroko Miura\",\"doi\":\"10.1002/jdd.13766\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objectives: </strong>In this study, we compared the performance of ChatGPT-3.5 to that of ChatGPT-4o in the context of the Japanese National Dental Examination, which assesses clinical reasoning skills and dental knowledge, to determine their potential usefulness in dental education.</p><p><strong>Methods: </strong>ChatGPT's performance was assessed using 1399 (55% of the exam) of 2520 questions from the Japanese National Dental Examinations (111-117). The 1121 excluded questions (45% of the exam) contained figures or tables that ChatGPT could not recognize. The questions were categorized into 18 different subjects based on dental specialty. 
Statistical analysis was performed using SPSS software, with McNemar's test applied to assess differences in performance.</p><p><strong>Results: </strong>A significant improvement was noted in the percentage of correct answers from ChatGPT-4o (84.63%) compared with those from ChatGPT-3.5 (45.46%), demonstrating enhanced reliability and subject knowledge. ChatGPT-4o consistently outperformed ChatGPT-3.5 across all dental subjects, with significant improvements in subjects such as oral surgery, pathology, pharmacology, and microbiology. Heatmap analysis revealed that ChatGPT-4o provided more stable and higher correct answer rates, especially for complex subjects.</p><p><strong>Conclusions: </strong>This study found that advanced natural language processing models, such as ChatGPT-4o, potentially have sufficiently advanced clinical reasoning skills and dental knowledge to function as a supplementary tool in dental education and exam preparation.</p>\",\"PeriodicalId\":50216,\"journal\":{\"name\":\"Journal of Dental Education\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2024-11-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Dental Education\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1002/jdd.13766\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"DENTISTRY, ORAL SURGERY & MEDICINE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Dental Education","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/jdd.13766","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
Performance of ChatGPT-3.5 and ChatGPT-4o in the Japanese National Dental Examination.
Objectives: In this study, we compared the performance of ChatGPT-3.5 with that of ChatGPT-4o on the Japanese National Dental Examination, which assesses clinical reasoning skills and dental knowledge, to determine their potential usefulness in dental education.
Methods: The performance of both models was assessed using 1399 of the 2520 questions (55%) from Japanese National Dental Examinations 111-117. The remaining 1121 questions (45%) were excluded because they contained figures or tables that ChatGPT could not recognize. The questions were categorized into 18 subjects based on dental specialty. Statistical analysis was performed using SPSS software, with McNemar's test applied to assess the difference in performance between the two models.
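As a hedged illustration of the statistical comparison described above: McNemar's test operates on the paired correct/incorrect outcomes of the two models for each of the 1399 questions. The sketch below uses statsmodels; the marginal totals are chosen to match the correct-answer rates reported in the Results (636/1399 ≈ 45.46% and 1184/1399 ≈ 84.63%), but the split of the discordant pairs is a hypothetical assumption, not the study's actual contingency table.

```python
# Minimal sketch of a McNemar comparison of two models on paired
# question-level outcomes. Cell counts are illustrative placeholders.
from statsmodels.stats.contingency_tables import mcnemar

# 2x2 paired contingency table over the 1399 shared questions:
#                     GPT-4o correct   GPT-4o wrong
# GPT-3.5 correct          a                b
# GPT-3.5 wrong            c                d
table = [
    [600, 36],   # hypothetical split; row sum 636 matches 45.46% of 1399
    [584, 179],  # hypothetical split; column sum 1184 matches 84.63%
]

# exact=False uses the chi-squared approximation with continuity correction,
# appropriate here because the discordant-pair counts are large.
result = mcnemar(table, exact=False, correction=True)
print(f"McNemar chi-squared = {result.statistic:.2f}, p = {result.pvalue:.4g}")
```

With any plausible split of this size, the discordant pairs are heavily one-sided (many more questions answered correctly only by ChatGPT-4o than only by ChatGPT-3.5), which is what drives the significant result.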
Results: The percentage of correct answers improved significantly from ChatGPT-3.5 (45.46%) to ChatGPT-4o (84.63%), demonstrating enhanced reliability and subject knowledge. ChatGPT-4o consistently outperformed ChatGPT-3.5 across all dental subjects, with significant improvements in subjects such as oral surgery, pathology, pharmacology, and microbiology. Heatmap analysis revealed that ChatGPT-4o provided more stable and higher correct-answer rates, especially for complex subjects.
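A minimal sketch of how such a per-subject heatmap could be rendered with matplotlib; the subject list is abbreviated to four of the 18 subjects and every rate shown is an illustrative placeholder, not a value from the study.

```python
# Hedged sketch: per-subject correct-answer rates for the two models
# rendered as a 2-row heatmap. All numbers below are hypothetical.
import matplotlib.pyplot as plt
import numpy as np

subjects = ["Oral surgery", "Pathology", "Pharmacology", "Microbiology"]
rates = np.array([
    [0.42, 0.48, 0.44, 0.47],  # ChatGPT-3.5 (hypothetical rates)
    [0.86, 0.83, 0.85, 0.88],  # ChatGPT-4o (hypothetical rates)
])

fig, ax = plt.subplots(figsize=(6, 2))
im = ax.imshow(rates, cmap="viridis", vmin=0, vmax=1)
ax.set_xticks(range(len(subjects)))
ax.set_xticklabels(subjects, rotation=30, ha="right")
ax.set_yticks([0, 1])
ax.set_yticklabels(["ChatGPT-3.5", "ChatGPT-4o"])
# Annotate each cell with its rate for readability.
for i in range(rates.shape[0]):
    for j in range(rates.shape[1]):
        ax.text(j, i, f"{rates[i, j]:.0%}", ha="center", va="center", color="w")
fig.colorbar(im, ax=ax, label="Correct answer rate")
fig.tight_layout()
plt.show()
```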
Conclusions: This study found that advanced natural language processing models, such as ChatGPT-4o, potentially have sufficiently advanced clinical reasoning skills and dental knowledge to function as a supplementary tool in dental education and exam preparation.
Journal Introduction:
The Journal of Dental Education (JDE) is a peer-reviewed monthly journal that publishes a wide variety of educational and scientific research in dental, allied dental and advanced dental education. Published continuously by the American Dental Education Association since 1936 and internationally recognized as the premier journal for academic dentistry, the JDE publishes articles on such topics as curriculum reform, education research methods, innovative educational and assessment methodologies, faculty development, community-based dental education, student recruitment and admissions, professional and educational ethics, dental education around the world and systematic reviews of educational interest. The JDE is one of the top scholarly journals publishing the most important work in oral health education today; it celebrated its 80th anniversary in 2016.