Background: Clinicians are central to treating tobacco use disorder, yet practical training is inconsistent, and confidence varies. Brief, text message-based microlearning may offer a low-burden way to strengthen foundational competencies in busy clinical settings.
Objective: This paper aims to evaluate whether a short SMS microlearning series improves clinicians' self-reported confidence in managing tobacco use disorder.
Methods: We conducted a single-arm, pre-post educational pilot at an academic medical center. A brief formative survey (13 items; 106 respondents) identified local knowledge gaps and informed message topics and sequencing. The 13-day series delivered 1 concise message per day with key teaching points and links to curated resources. The prespecified primary outcome was self-reported confidence in managing tobacco use disorder (1-100 scale) measured immediately before and after the series. Of the 34 clinicians who signed up, 22 completed the baseline questionnaire and enrolled (attendings: n=4, 18%; trainees: n=18, 82%). Changes in confidence among participants with paired ratings were tested with a paired t test. Engagement with embedded links was recorded.
Results: All enrolled participants completed the 13-day series; none unsubscribed. Postintervention confidence ratings were provided by 18 participants. Mean confidence increased from 60 (SD 16) at baseline to 85 (SD 10) after the series (t17=-10.71; P<.001). Embedded links were opened in 67% (178/266) of messages. Free-text feedback was predominantly positive and emphasized the convenience, clarity, and point-of-care usefulness of brief messages.
Conclusions: A brief SMS microlearning series was associated with a substantial improvement in clinicians' confidence to manage tobacco use disorder, with high completion and evidence of engagement. This low-cost, scalable approach appears practical for busy clinicians. Findings should be interpreted cautiously given the single-arm design, self-selection, and reliance on self-reported confidence rather than objective knowledge or clinical outcomes. Future studies should include a validated knowledge assessment, a randomized comparison, broader sampling, and follow-up to assess durability and impact on care.
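The pre-post comparison described above rests on a paired t test. The sketch below shows that computation on hypothetical confidence ratings (not the study's data); the sign convention (before minus after) matches the negative t statistic reported in the Results.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for pre-post ratings.

    t = mean(d) / (sd(d) / sqrt(n)), where d_i = before_i - after_i,
    so an improvement (after > before) yields a negative t.
    """
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical 1-100 confidence ratings for 6 clinicians (illustrative only).
pre = [55, 62, 58, 70, 48, 65]
post = [82, 88, 80, 92, 75, 86]
t, df = paired_t(pre, post)
print(f"t{df} = {t:.2f}")
```

With 18 paired ratings, as in the study, the degrees of freedom would be 17, giving the t17 form reported above.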
Text Message (SMS) Microlearning for Tobacco Use Disorder: Pre-Post Pilot Study of Clinician Confidence. Zehra Dhanani, Veena Dronamraju, Jamie Garfield. JMIR Medical Education. 2025;11:e73821. doi:10.2196/73821
Text Message (SMS) Microlearning for Tobacco Use Disorder: Pre-Post Pilot Study of Clinician Confidence. Zehra Dhanani, Veena Dronamraju, Jamie Garfield. JMIR Medical Education. 2025;11:e73821. doi:10.2196/73821
Carl Preiksaitis, Joshua Hughes, Rana Kabeer, William Dixon, Christian Rose
Background: The optimal duration of emergency medicine (EM) residency training remains a subject of national debate, with the Accreditation Council for Graduate Medical Education considering standardizing all programs to 4 years. However, empirical data on how residents accumulate clinical exposure over time are limited. Traditional measures, such as case logs and diagnostic codes, often fail to capture the breadth and depth of diagnostic reasoning. Natural language processing (NLP) of clinical documentation offers a novel approach to quantifying clinical experiences more comprehensively.
Objective: This study aimed to (1) quantify how EM residents acquire clinical topic exposure over the course of training, (2) evaluate variation in exposure patterns across residents and classes, and (3) assess changes in workload and case complexity over time to inform the discussion on optimal program length.
Methods: We conducted a retrospective cohort study of EM residents at Stanford Hospital, analyzing 244,255 emergency department encounters from July 1, 2016, to November 30, 2023. The sample included 62 residents across 4 graduating classes (2020-2023), representing all primary training site encounters where residents served as primary or supervisory providers. Using a retrieval-augmented generation NLP pipeline, we mapped resident clinical documentation to the 895 subcategories of the 2022 Model for Clinical Practice of Emergency Medicine (MCPEM) via intermediate mapping to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) Clinical Observations, Recordings, and Encoding (CORE) problem list subset. We generated cumulative topic exposure curves, quantified the diversity of topic coverage, assessed variability between residents, and analyzed the progression in clinical complexity using Emergency Severity Index (ESI) scores and admission rates.
Results: Residents encountered the largest increase in new topics during postgraduate year 1 (PGY1), averaging 376.7 (42.1%) unique topics among a total of 895 MCPEM subcategories. By PGY4, they averaged 565.9 (63.2%) topics, representing a 9.9% (51/515) increase over PGY3. Exposure plateaus generally occurred at 39 to 41 months, although substantial individual variation was observed, with some residents continuing to acquire new topics until graduation. Annual case volume more than tripled from PGY1 (mean 445.7, SD 112.7 encounters) to PGY4 (mean 1528.4, SD 112.7 encounters). Case complexity increased, as evidenced by a decrease in mean ESI score from 2.94 to 2.79 and a rise in high-acuity (ESI 1-2) cases from 16% (4374/27,340) to 30.9% (9418/30,466).
Conclusions: NLP analysis of clinical documentation provides a scalable, detailed method for tracking EM residents' clinical exposure and progression. Many residents continue to gain new experiences into their fourth year, particularly in higher-acuity cases.
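The cumulative topic exposure curves described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: each encounter is assumed to have already been mapped (e.g., by the NLP step) to a set of curriculum topic labels, and the curve records how many unique topics a resident has seen after each encounter.

```python
from typing import Iterable, List, Set

def cumulative_exposure(encounters: Iterable[Set[str]]) -> List[int]:
    """Cumulative count of unique topics after each encounter.

    Each encounter is the set of topics (e.g., MCPEM subcategories)
    its documentation mapped to; the returned curve shows how quickly
    new topics accumulate over training.
    """
    seen: Set[str] = set()
    curve: List[int] = []
    for topics in encounters:
        seen.update(topics)
        curve.append(len(seen))
    return curve

# Toy encounter stream (topic labels are made up for illustration).
stream = [{"chest pain"}, {"chest pain", "dyspnea"},
          {"sepsis"}, {"dyspnea"}, {"stroke", "sepsis"}]
print(cumulative_exposure(stream))  # → [1, 2, 3, 3, 4]
```

A plateau in such a curve, as reported at 39 to 41 months, appears as a long run of encounters that add no new topics.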
Quantifying Emergency Medicine Residency Learning Curves Using Natural Language Processing: Retrospective Cohort Study. JMIR Medical Education. 2025;11:e82326. doi:10.2196/82326
Bastien Le Guellec, Victoria Gauthier, Rémi Lenain, Alexandra Nuytten, Luc Dauchet, Brigitte Bonneau, Erwin Gerard, Claire Castandet, Patrick Truffert, Marc Hazzan, Philippe Amouyel, Raphaël Bentegeac, Aghiles Hamroun
Background: Early exposure to research methodology is essential in medical education, yet many students show limited motivation to engage with nonclinical content. Gamified strategies such as educational escape rooms may help improve engagement, but few studies have explored their feasibility at scale or evaluated their impact beyond student satisfaction.
Objective: This study aimed to assess the feasibility, engagement, and perceived educational value of a large-scale escape room specifically designed to introduce third-year medical students to the principles of diagnostic test evaluation.
Methods: We developed a low-cost immersive escape room based on a fictional diagnostic accuracy study with 6 puzzles mapped to 5 predefined learning objectives: (1) identifying key components of a diagnostic study protocol, (2) selecting an appropriate gold standard test, (3) defining a relevant study population, (4) building and interpreting a contingency table, and (5) critically appraising diagnostic metrics in context. The intervention was deployed to an entire class of third-year medical students across 12 sessions between March 2023 and April 2023. Each session included 60 minutes of gameplay and a 45-minute debriefing. Students completed pre- and postintervention questionnaires assessing their knowledge of diagnostic test evaluation and perceptions of research training. Descriptive statistics and 2-tailed paired t tests were used to evaluate score changes; univariate linear regressions assessed associations with demographics. Free-text comments were analyzed using the Reinert hierarchical classification method.
Results: Of the 530 participants, 490 (92.5%) completed the full evaluation. Many participants had limited previous exposure to escape rooms (206/490, 42% had never participated in one), and most (253/490, 51.6%) reported low initial confidence in the critical appraisal of scientific articles. Mean overall knowledge scores increased from 62 of 100 (SD 1) before to 82 of 100 (SD 2) after the activity (+32%; P<.001). Gains were observed across all learning objectives and were not influenced by age, sex, or previous experience. Students rated the educational escape room as highly entertaining (mean score 9.1/10, SD 1.1) and educational (mean score 8.2/10, SD 1.5). Following the intervention, 86.9% (393/452) felt more comfortable with critical appraisal of diagnostic test studies, and 79% (357/452) considered the escape room format highly appropriate for an introductory session.
Conclusions: This study demonstrates the feasibility and enthusiastic reception of a large-scale, reusable escape room aimed at teaching the fundamental principles of diagnostic test evaluation to undergraduate medical students. This approach may serve as a valuable entry point to engage students with evidence-based reasoning and pave the way for deeper exploration.
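Two of the escape room's learning objectives concern building a 2x2 contingency table and appraising diagnostic metrics. A minimal sketch of those computations, with made-up counts for illustration:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy metrics from a 2x2 contingency table.

    Rows: index test result (+/-); columns: gold standard (disease +/-).
    tp/fp/fn/tn are the four cell counts.
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(test+ | disease+)
        "specificity": tn / (tn + fp),  # P(test- | disease-)
        "ppv": tp / (tp + fp),          # P(disease+ | test+)
        "npv": tn / (tn + fn),          # P(disease- | test-)
    }

# Worked example with hypothetical counts: 90 true positives,
# 10 false negatives, 30 false positives, 870 true negatives.
m = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
print({k: round(v, 3) for k, v in m.items()})
```

Note that sensitivity and specificity are properties of the test, whereas the predictive values also depend on disease prevalence in the study population, which is why objective (3), defining a relevant population, matters for appraisal.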
Engaging Undergraduate Medical Students With Introductory Research Training via an Educational Escape Room: Mixed Methods Evaluation. JMIR Medical Education. 2025;11:e71339. doi:10.2196/71339
Liyuan Xu, Qinrong Xu, Chunya Liu, Baozhen Chen, Chunxia Wang
Background: Clinical internal medicine practice training traditionally relies on case-based teaching. This approach limits the development of students' clinical thinking skills and places significant pressure on instructors. Virtual standardized patients (VSPs) could offer an alternative solution, but evidence on their feasibility and effectiveness remains limited.
Objective: This study aimed to establish the clinical application of VSPs through the "VSPs in general practice" interactive diagnostic and teaching system, which presents 3D virtual simulated patients in simulated clinical scenarios and trains medical students through system-preset cases, and to compare its effectiveness with traditional teaching in improving students' clinical thinking ability.
Methods: A randomized controlled trial was conducted from October 20, 2022, to October 20, 2024. A total of 60 medical students interning at Quzhou People's Hospital were enrolled and divided into 2 groups: an experimental group receiving VSP training (30/60, 50%) and a control group receiving traditional academic training (30/60, 50%). Teaching effectiveness was evaluated using basic knowledge assessments and virtual system scoring. After completing the course, students were surveyed with a questionnaire to assess their satisfaction with the course.
Results: All enrolled medical students completed the study. The experimental group showed significantly greater improvement in theoretical scores than the control group (mean 17.07, SD 4.24 vs mean 10.67, SD 4.91; F1,59=29.20; Cohen d=1.15; 95% CI 12.43-15.31; P<.001), and the total score improvement on the virtual clinical thinking training system test was also significantly greater in the experimental group (mean 42.60, SD 9.56 vs mean 31.63, SD 7.24; F1,59=25.10; Cohen d=1.09; 95% CI 34.51-39.72; P<.001). Specifically, improvements in consultation skills (mean 8.76, SD 1.67 vs mean 7.66, SD 2.08; F1,59=31.09; Cohen d=0.55; 95% CI 7.70-8.70; P<.001), overall objective improvement (mean 11.97, SD 2.77 vs mean 8.15, SD 2.62; F1,59=30.08; Cohen d=1.16; 95% CI 9.21-10.91; P<.001), initial diagnostic ability (mean 8.74, SD 1.67 vs mean 7.66, SD 2.08; F1,59=4.91; Cohen d=0.55; 95% CI 7.70-8.70; P=.03), and ability to provide patient treatment (mean 7.23, SD 2.41 vs mean 5.72, SD 2.19; F1,59=6.42; Cohen d=0.63; 95% CI 5.85-7.01; P=.01) were all significantly greater in the experimental group than in the control group. The questionnaire results indicated that 90% (27/30) of the students who participated in the VSP training believed it could enhance their clinical thinking abilities.
Conclusions: VSPs reinforce the foundational knowledge of internal medicine among medical students, strengthen their clinical thinking ability, and improve their capacity for independent work. The VSP system is feasible, practical, and cost-effective and merits wider adoption in clinical education.
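The between-group effect sizes above are Cohen d values. A minimal sketch of the pooled-SD computation, using illustrative numbers rather than the trial's raw data (which are not reported here):

```python
import math
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Cohen d for two independent groups using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    sa, sb = stdev(group_a), stdev(group_b)
    pooled = math.sqrt(((na - 1) * sa**2 + (nb - 1) * sb**2) / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / pooled

# Illustrative score improvements (not the trial's data): both groups
# have SD 2 and means 10 vs 8, so d = (10 - 8) / 2 = 1.0.
print(cohens_d([8, 10, 12], [6, 8, 10]))  # → 1.0
```

By the usual convention, d around 0.5 is a medium effect and d above 0.8 a large one, so the reported values of 1.09 to 1.16 for the overall score improvements indicate large effects.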
Virtual Standardized Patients for Improving Clinical Thinking Ability Training in Residents: Randomized Controlled Trial. JMIR Medical Education. 2025;11:e73196. doi:10.2196/73196
Lujain Aloum, Halah Ibrahim, Senthil Kumar Rajasekaran, Eman Alefishat
Background: Medical education continues to favor didactic lectures as the predominant method of instruction. However, in recent years, there has been a shift toward active learning methodologies such as gamification.
Objective: This study aimed to describe the implementation of 3 open-access, web-based pharmacology games tailored for medical students: Cross DRUGs, Find the DRUG, and DRUGs Escape Room. The study also evaluated the impact of gamification on knowledge retention, student engagement, and learning experience in pharmacology education.
Methods: We used a quasi-experimental design to examine the effects of gamification on knowledge retention by comparing pretest and posttest scores between the gamer and control groups. Each week, students self-selected into either the gamer group or the control group based on personal preference. All students were provided with online access to the same lecture slides. Students in the control group completed both the pretest and posttest but did not play any of the games. A survey was administered to assess students' perceptions of gamification as a learning tool.
Results: Of the 72 students enrolled in the course, 49 (68%) agreed to participate, with 40 (56%) students completing both the pretest and posttest and being included in our analysis. As participation could vary weekly, an individual student might have appeared in both groups across different weeks, resulting in 59 gamer sessions and 20 control sessions. The mean pretest scores were 6.05 (SD 2.31) for the control group and 6.20 (SD 2.13) for the gamer group. The mean posttest scores were 6.90 (SD 2.02) for the control group and 8.47 (SD 1.30) for the gamer group. The gamer group exhibited significantly improved posttest scores (P=.006), while the control group did not (P=.21). Most respondents (25/30, 83%) found the games enjoyable and agreed that the games effectively helped them understand pharmacological concepts (24/30, 80%). Additionally, 70% (21/30) of students believed they learned better from the gaming format than from didactic lectures. Most favored a blended approach that combines lectures with games or case studies.
Conclusions: Gamification can serve as an effective complementary teaching tool for helping medical students learn pharmacological concepts.
Open-Access Web-Based Gamification in Pharmacology Education for Medical Students: Quasi-Experimental Study. JMIR Medical Education. 2025;11:e73666. doi:10.2196/73666
Background: Cancer immunotherapy represents a transformative advancement in oncology, offering new avenues for treating malignancies by harnessing the immune system. Despite its growing clinical relevance, immunotherapy remains underrepresented in undergraduate medical education, particularly in curricula integrating foundational immunology with clinical application. To address this gap, we developed and implemented a fully online elective for fourth-year medical students focused on core immunology concepts, immunotherapy mechanisms, FDA-approved treatments, immune-related adverse events, and patient-centered therapeutic decision-making.
Objective: This study aimed to evaluate the effectiveness of an asynchronous-synchronous online cancer immunotherapy elective in improving medical student knowledge, engagement, and critical-thinking skills. We hypothesized that participation in the elective would be associated with perceived improvements in knowledge and clinical preparedness and would inform future strategies for integrating cancer immunotherapy into medical curricula.
Methods: We conducted a mixed-methods study with fourth-year medical students enrolled in a two-week elective at a US medical school. The curriculum included a self-paced foundational module, an online discussion board, and a capstone oral presentation requiring students to propose a novel immunotherapy approach. Participants completed pre- and post-course quizzes assessing immunotherapy knowledge and an anonymous post-course Likert-scale survey. Quantitative data were summarized descriptively, and Likert responses were reported using medians and interquartile ranges (IQRs). Because of the small sample size, unpaired t tests comparing pre- and post-course quiz averages were underpowered to detect statistically significant differences. Qualitative data were analyzed using inductive thematic analysis with investigator triangulation.
Results: A total of 35 students completed the elective, and 20 submitted the post-course survey (response rate: 57%). Across all Likert-scale items, students reported a median response of 5 (Strongly Agree) with IQR values ranging from 0 to 1, indicating uniformly positive perceptions and minimal variability in their evaluation of the course. Descriptively, average post-course quiz scores were higher than pre-course scores, suggesting improved conceptual understanding. Qualitative thematic analysis revealed three major themes: (1) increased confidence engaging with complex immunotherapy mechanisms, (2) appreciation for the flexibility and interactivity afforded by the hybrid asynchronous-synchronous model, and (3) enhanced understanding of the real-world clinical application of immunotherapy across interdisciplinary settings.
Conclusions: Descriptive quantitative and qualitative findings suggest that a targeted online cancer immunotherapy elective may enh
Implementation and Evaluation of a Cancer Immunotherapy Elective for Medical Students: A Mixed-Methods Descriptive Study. Mark Raynor, Rivers Hock, Brandon Godinich, Satish Maharaj, Houriya Ayoubieh, Cynthia Perry, Jessica Chacon. JMIR Medical Education. doi:10.2196/71628. Published December 3, 2025.
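Likert summaries like the elective's (median 5, IQR 0 to 1) can be computed directly with Python's statistics module. The responses below are invented for illustration, not the survey's data:

```python
import statistics

def likert_summary(responses):
    """Median and interquartile range (Q3 - Q1) of 1-5 Likert responses."""
    q = statistics.quantiles(responses, n=4, method="inclusive")
    return statistics.median(responses), q[2] - q[0]

# Hypothetical responses to one Likert item (5 = Strongly Agree)
responses = [5, 5, 4, 5, 5, 4, 5, 4, 5, 4]
median, iqr = likert_summary(responses)
print(f"median {median}, IQR {iqr}")
```

Medians with IQRs are the usual summary for ordinal Likert data, since means assume equal spacing between response options.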
Trung Anh Nguyen, Thanh Binh Nguyen, Duy Cuong Nguyen, Anh Dung Vu, Khanh Linh Dang, Nhu Quynh Le, Duy Anh Ngo, Dang Kien Nguyen, Van Thuan Hoang, Thanh Binh Ngo
Artificial intelligence (AI) has the potential to transform medical training through adaptive learning, immersive simulations, automated assessments, and data-driven insights, offering solutions to persistent issues such as high student-to-faculty ratios, overcrowded classrooms, and limited clinical exposure. Globally, many universities have already embedded AI literacy and competencies into undergraduate, postgraduate, and continuing education programs, while in Vietnam, the use of AI in medical education remains limited and fragmented. Most students have little formal exposure to AI, and empirical evidence on faculty or institutional readiness is scarce. Experiences from other countries, including Malaysia, Palestine, and Oman, demonstrate that incremental adoption and faculty development can facilitate cultural acceptance and curricular innovation, providing useful lessons for Vietnam. At the same time, significant barriers remain. These include inadequate infrastructure in provincial universities, low levels of AI literacy among both students and educators, underdeveloped regulatory and ethical frameworks, and resistance to pedagogical change. Cost-effectiveness and sustainability are additional concerns in a middle-income context, where upfront investments must be balanced against long-term benefits and equitable access. Advancing AI in Vietnamese medical education will therefore require a coordinated national strategy that prioritizes infrastructure, AI literacy, faculty development, quality assurance, and sustainable funding models, alongside ethical and legal safeguards. By addressing these key foundations, Vietnam can harness AI not only to modernize medical education but also to strengthen preparedness for a digitally enabled health workforce.
What Are the Opportunities and Challenges of Using AI in Medical Education in Vietnam? JMIR Medical Education. 2025;11:e77817. doi:10.2196/77817. Published December 2, 2025.
Boris Modrau, Karina Frahm Kirk, Sinan Mouaayad Abdulaimma Said, Carsten Reidies Bjarkam, Lone Sunde, Jacob Bodilsen, Jakob Dal, Jette Kolding Kristensen, Jeppe Emmersen, Mike Bundgaard Astorp, Stig Andersen
Background: The impact of Pass/Fail versus Tiered grade assessment for exams in undergraduate medical education has caused much debate, but there are few data to inform decision-making. The increasing number of medical schools that have transitioned to Pass/Fail assessment has raised concerns about medical students' academic performance. In 2018, during the undergraduate medical curriculum reform at the Faculty of Medicine, Aalborg University, some exams were changed from Pass/Fail to Tiered grade and vice versa. These changes provide an opportunity to evaluate the different assessment forms.
Objective: This study aimed to evaluate medical students' academic performance at the final licensing exam in relation to the exam grading principle.
Methods: This single-center cohort study at Aalborg University Medical School, North Denmark Region, assessed the change from 2-digit Tiered grade to Pass/Fail evaluation, and vice versa, of undergraduate medical students' exams after the 4th and 5th year clinical training modules from Autumn 2015 through Spring 2023. The primary outcomes were (1) the average grades at the final licensing exam and (2) the number of students failing exams during the preceding two years.
Results: Among the 7634 exams in total, 7164 4th and 5th year clinical training exams were included in the comparisons, of which 3047 (42.5%) were Pass/Fail exams and 4117 (57.5%) were Tiered grade exams. The frequency of students failing was 3.3% (101/3047) with Pass/Fail exams and 2.0% (81/4117) with Tiered grade exams (P<.001). This difference disappeared when the near-fail Tiered grade was counted as a Fail. Tiered grade results did not differ between semesters (P=.99), nor did they show a time trend in the 4th year (P=.66). The final licensing exam grades were unaltered (P=.47).
Conclusions: Contrary to our expectation, Pass/Fail exams exhibited a higher fail rate than Tiered grade exams without lowering final academic performance. These results suggest that a shift from Tiered grades to Pass/Fail assessment redirects the focus from rewarding high performance to ensuring that standards are maintained among underperforming students.
Pass/Fail Versus Tiered Grades and Academic Performance in Undergraduate Medical Education: Crossover Study. JMIR Medical Education. 2025;11:e74975. doi:10.2196/74975. Published December 2, 2025.
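The fail-rate comparison above (101/3047 Pass/Fail vs 81/4117 Tiered grade) can be checked with a Pearson chi-square test on a 2x2 table. The abstract does not name its test, so this is a sketch of one standard choice, using only the counts reported:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (df = 1) for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Fail/pass counts from the abstract: 101 of 3047 Pass/Fail exams failed,
# 81 of 4117 Tiered grade exams failed
chi2 = chi_square_2x2(101, 3047 - 101, 81, 4117 - 81)
print(f"chi-square = {chi2:.2f}")
```

The statistic comfortably exceeds 10.83, the df = 1 critical value for P < .001, consistent with the reported significance.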
Background: Large language models (LLMs) offer the potential to improve virtual patient-physician communication and reduce health care professionals' workload. However, limitations in accuracy, outdated knowledge, and safety issues restrict their effective use in real clinical settings. Addressing these challenges is crucial for making LLMs a reliable health care tool.
Objective: This study aimed to evaluate the efficacy of Med-RISE, an information retrieval and augmentation tool, in comparison with baseline LLMs, focusing on enhancing accuracy and safety in medical question answering across diverse clinical domains.
Methods: This comparative study introduces Med-RISE, an enhanced version of a retrieval-augmented generation framework specifically designed to improve question-answering performance across wide-ranging medical domains and diverse disciplines. Med-RISE consists of 4 key steps: query rewriting, information retrieval (providing local and real-time retrieval), summarization, and execution (a fact and safety filter before output). This study integrated Med-RISE with 4 LLMs (GPT-3.5, GPT-4, Vicuna-13B, and ChatGLM-6B) and assessed their performance on 4 multiple-choice medical question datasets: MedQA (US Medical Licensing Examination), PubMedQA (original and revised versions), MedMCQA, and EYE500. Primary outcome measures included answer accuracy and hallucination rates, with hallucinations categorized into factuality (inaccurate information) or faithfulness (inconsistency with instructions) types. This study was conducted between March 2024 and August 2024.
Results: The integration of Med-RISE with each LLM led to a substantial increase in accuracy, with improvements ranging from 9.8% to 16.3% (mean 13%, SD 2.3%) across the 4 datasets. The accuracy improvements were 16.3%, 12.9%, 13%, and 9.8% for GPT-3.5, GPT-4, Vicuna-13B, and ChatGLM-6B, respectively. In addition, Med-RISE effectively reduced hallucinations, with reductions ranging from 11.8% to 18% (mean 15.1%, SD 2.8%); factuality hallucinations decreased by 13.5% and faithfulness hallucinations by 5.8%. The hallucination rate reductions were 17.7%, 12.8%, 18%, and 11.8% for GPT-3.5, GPT-4, Vicuna-13B, and ChatGLM-6B, respectively.
Conclusions: The Med-RISE framework significantly improves the accuracy and reduces the hallucinations of LLMs in medical question answering across benchmark datasets. By providing local and real-time information retrieval and fact and safety filtering, Med-RISE enhances the reliability and interpretability of LLMs in the medical domain, offering a promising tool for clinical practice and decision support.
Enhancing Large Language Models for Improved Accuracy and Safety in Medical Question Answering: Comparative Study. Dingqiao Wang, Jinguo Ye, Jingni Li, Jiangbo Liang, Qikai Zhang, Qiuling Hu, Caineng Pan, Dongliang Wang, Zhong Liu, Wen Shi, Mengxiang Guo, Fei Li, Wei Du, Ying-Feng Zheng. JMIR Medical Education. 2025;11:e70190. doi:10.2196/70190. Published December 2, 2025.
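The 4-step retrieval-augmented pattern described in that abstract (query rewriting, retrieval, summarization, then a fact/safety filter before output) can be sketched as a minimal pipeline. Every function name and the toy corpus below are hypothetical stand-ins, not the Med-RISE implementation:

```python
# Toy knowledge base standing in for local/real-time retrieval sources
CORPUS = {
    "metformin": "Metformin is a first-line therapy for type 2 diabetes.",
    "glaucoma": "Glaucoma is managed by lowering intraocular pressure.",
}

def rewrite(query: str) -> str:
    # Step 1: query rewriting (a real system would use an LLM here)
    return query.lower().strip("?")

def retrieve(query: str) -> list[str]:
    # Step 2: keyword lookup standing in for document retrieval
    return [text for key, text in CORPUS.items() if key in query]

def summarize(passages: list[str]) -> str:
    # Step 3: condense retrieved evidence into the prompt context
    return " ".join(passages)

def execute(context: str) -> str:
    # Step 4: fact/safety filter before output; refuse when unsupported
    if not context:
        return "Insufficient evidence to answer safely."
    return f"Based on retrieved evidence: {context}"

def answer(query: str) -> str:
    return execute(summarize(retrieve(rewrite(query))))

print(answer("Is metformin used in type 2 diabetes?"))
```

The key design point mirrored here is that the final stage can decline to answer when retrieval returns nothing, which is one way such frameworks trade coverage for a lower hallucination rate.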
Alistair Thorpe, Angelos P Kassianos, Ruth Plackett, Vinodh Krishnamurthy, Maria A Kambouri, Jessica Sheringham
Background: Clinical reasoning is increasingly recognized as an important skill in the diagnosis of common and serious conditions. eCREST (electronic Clinical Reasoning Educational Simulation Tool), a clinical reasoning learning resource, was developed to support medical students to learn clinical reasoning. However, primary care teams now encompass a wider range of professional groups, such as physician assistants (PAs), who also need to develop clinical reasoning during their training. Understanding PAs' clinical reasoning processes is key to judging the transferability of learning resources initially targeted to medical students.
Objective: This exploratory study aimed to measure the processes of clinical reasoning undertaken on eCREST by PA students and compare PAs' reasoning processes with previous data collected on medical students.
Methods: Between 2017 and 2021, PA students and medical students used eCREST to learn clinical reasoning skills in an experimental or learning context. Students undertook 2 simulated cases of patients presenting with lung symptoms. They could ask questions, order bedside tests, and select physical exams during the case to help them form, reflect on, and reconsider diagnostic ideas and management strategies while completing a case. Exploratory analysis was undertaken by comparing students' data gathering, flexibility in diagnosis, and diagnostic ideas between medical and PA students.
Results: In total, 159 medical students and 54 PA students completed the cases. PAs were older (mean 27, SD 7 y vs mean 24, SD 4 y; P<.001) and more likely to be female (43/54, 80% vs 84/159, 53%; P<.001). Medical and PA students were similar in the proportion of essential questions asked (Case 1: mean 70.1 vs mean 73.2; P=.33; Case 2: mean 74.6 vs mean 70.9; P=.27), physical examinations requested (Case 1: mean 54.7 vs mean 54.0; P=.59; Case 2: mean 69.3 vs mean 67.5; P=.59), bedside tests selected (Case 1: mean 74.4 vs mean 83.3; P=.05; Case 2: mean 47.9 vs mean 50.0; P=.69), and number of times they changed their diagnoses (Case 1: mean 2.8 vs mean 2.8; P=.99; Case 2: mean 2.8 vs mean 2.5; P=.81). Both student groups improved in their diagnostic accuracy during the cases.
Conclusions: These results provide suggestive evidence that medical and PA students had similar clinical reasoning styles when using an online training tool to support their diagnostic decision-making.
Comparison of Physician Assistant and Medical Students' Clinical Reasoning Processes Using an Online Patient Simulation Tool to Support Clinical Reasoning (eCREST): Mixed Methods Study. JMIR Medical Education. 2025;11:e68981. doi:10.2196/68981. Published December 1, 2025.