The rapid advancement of artificial intelligence (AI) chatbots has generated significant interest regarding their potential applications within medical education. This study sought to assess the performance of the open-source large language model DeepSeek-V3 in answering radiology board-style questions and to compare its accuracy with that of ChatGPT-3.5.A total of 161 questions (comprising 207 items) were randomly selected from the Exercise Book for the National Senior Health Professional Qualification Examination: Radiology. The question set included single-choice, multiple-choice, shared-stem, and case analysis questions. Both DeepSeek-V3 and ChatGPT-3.5 were evaluated using the same question set over a seven-day testing period. Response accuracy was systematically assessed, and statistical analyses were performed using Pearson's chi-square test and Fisher's exact test.DeepSeek-V3 achieved an overall accuracy of 72%, which was significantly higher than the 55.6% accuracy achieved by ChatGPT-3.5 (P < 0.001). Performance analysis by question type revealed DeepSeek's superior accuracy in single-choice questions (87.1%), though with comparatively lower performance in multiple-choice (55.7%) and case analysis questions (68.0%). Across clinical subspecialties, DeepSeek consistently outperformed ChatGPT, particularly in peripheral nervous system (P = 0.003), respiratory system (P = 0.008), circulatory system (P = 0.012), and musculoskeletal system (P = 0.021) domains.In conclusion, DeepSeek demonstrates considerable potential as an educational tool in radiology, particularly for knowledge recall and foundational learning applications. However, its relatively weaker performance on higher-order cognitive tasks and complex question formats suggests the need for further model refinement. Future research should investigate DeepSeek's capability in processing image-based questions and perform comparative analyses with more advanced models (e.g., GPT-5) to better evaluate its potential for medical education.
扫码关注我们
求助内容:
应助结果提醒方式:
