
Latest publications from the Journal of Educational Evaluation for Health Professions

Presidential address 2026: celebrating academic excellence and expanding computer-based testing across health professions.
IF 3.7 | Q1 (Education, Scientific Disciplines) | Pub Date: 2026-01-01 | Epub Date: 2026-01-09 | DOI: 10.3352/jeehp.2026.23.1
Hyunjoo Pai
{"title":"Presidential address 2026: celebrating academic excellence and expanding computer-based testing across health professions.","authors":"Hyunjoo Pai","doi":"10.3352/jeehp.2026.23.1","DOIUrl":"https://doi.org/10.3352/jeehp.2026.23.1","url":null,"abstract":"","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"23 ","pages":"1"},"PeriodicalIF":3.7,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145999404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Comparison of reference management software with new artificial intelligence-based tools.
IF 3.7 | Q1 (Education, Scientific Disciplines) | Pub Date: 2026-01-01 | Epub Date: 2026-01-15 | DOI: 10.3352/jeehp.2026.23.2
Jae Gyeong Jin, Seung Gyu Lee, Jea Hyeun Park, Jang Won Han, Jae Young Kim, Jungirl Seok, Jeong-Ju Yoo

Reference management software (RMS) represents a cornerstone of modern academic writing and publishing. For decades, programs such as EndNote, Zotero, and Mendeley have played central roles in facilitating citation organization, bibliography formatting, and collaborative scholarship. Although each platform has introduced unique innovations, persistent limitations remain, particularly with respect to usability, accessibility, and accuracy. In parallel, the rise of generative artificial intelligence has introduced an unprecedented challenge: the inadvertent incorporation of fabricated or incorrect references into manuscripts. This phenomenon has exposed a critical limitation of traditional RMS platforms, namely their inability to verify reference authenticity. Against this backdrop, new solutions have emerged. One such example is CiteWell (https://citewell.org/), an artificial intelligence (AI)-era RMS that introduces several notable innovations, including PubMed-integrated verification, an intuitive interface for new users, customizable journal-specific styles, and multilingual accessibility. This review provides a comprehensive historical overview of RMS, evaluates the strengths and weaknesses of major platforms, and positions emerging AI-based tools as a new paradigm that combines traditional reference management with essential safeguards for contemporary academic challenges.
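
CiteWell's PubMed-integrated verification is proprietary, but the underlying idea (checking whether a cited article actually exists in PubMed before accepting it) can be illustrated with NCBI's public E-utilities API. The sketch below is a minimal, hypothetical example of such a check, not CiteWell's implementation; the function name and exact-title query strategy are assumptions.

```python
# Minimal sketch of reference verification against PubMed via NCBI E-utilities.
# Hypothetical example only; not CiteWell's actual implementation.
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_title_exists(title: str) -> bool:
    """Return True if an exact-title search in PubMed yields at least one hit."""
    params = {
        "db": "pubmed",
        "term": f'"{title}"[Title]',  # exact-phrase search restricted to titles
        "retmode": "json",
        "retmax": "1",
    }
    resp = requests.get(ESEARCH, params=params, timeout=10)
    resp.raise_for_status()
    return int(resp.json()["esearchresult"]["count"]) > 0

# A fabricated citation would typically return False here.
print(pubmed_title_exists("Comparison of reference management software "
                          "with new artificial intelligence-based tools."))
```

A production tool would additionally need fuzzy title matching, DOI lookup, and rate-limit handling, all of which this sketch omits.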

{"title":"Comparison of reference management software with new artificial intelligence-based tools.","authors":"Jae Gyeong Jin, Seung Gyu Lee, Jea Hyeun Park, Jang Won Han, Jae Young Kim, Jungirl Seok, Jeong-Ju Yoo","doi":"10.3352/jeehp.2026.23.2","DOIUrl":"10.3352/jeehp.2026.23.2","url":null,"abstract":"<p><p>Reference management software (RMS) represents a cornerstone of modern academic writing and publishing. For decades, programs such as EndNote, Zotero, and Mendeley have played central roles in facilitating citation organization, bibliography formatting, and collaborative scholarship. Although each platform has introduced unique innovations, persistent limitations remain, particularly with respect to usability, accessibility, and accuracy. In parallel, the rise of generative artificial intelligence has introduced an unprecedented challenge: the inadvertent inclusion of fabricated or incorrect references mistakenly incorporated into manuscripts. This phenomenon has exposed a critical limitation of traditional RMS platforms, namely their inability to verify reference authenticity. Against this backdrop, new solutions have emerged. One such example is CiteWell (https://citewell.org/), an artificial intelligence (AI)-era RMS that introduces several notable innovations, including PubMed-integrated verification, an intuitive interface for new users, customizable journal-specific styles, and multilingual accessibility. This review provides a comprehensive historical overview of RMS, evaluates the strengths and weaknesses of major platforms, and positions emerging AI-based tools as a new paradigm that combines traditional reference management with essential safeguards for contemporary academic challenges.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"23 ","pages":"2"},"PeriodicalIF":3.7,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145999443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Accuracy of ChatGPT in answering cardiology board-style questions.
IF 9.3 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-02-27 | DOI: 10.3352/jeehp.2025.22.9
Albert Andrew
{"title":"Accuracy of ChatGPT in answering cardiology board-style questions.","authors":"Albert Andrew","doi":"10.3352/jeehp.2025.22.9","DOIUrl":"10.3352/jeehp.2025.22.9","url":null,"abstract":"","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"9"},"PeriodicalIF":9.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12042102/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143517011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The role of large language models in the peer-review process: opportunities and challenges for medical journal reviewers and editors.
IF 9.3 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-01-16 | DOI: 10.3352/jeehp.2025.22.4
Jisoo Lee, Jieun Lee, Jeong-Ju Yoo

The peer review process ensures the integrity of scientific research. This is particularly important in the medical field, where research findings directly impact patient care. However, the rapid growth of publications has strained reviewers, causing delays and potential declines in quality. Generative artificial intelligence, especially large language models (LLMs) such as ChatGPT, may assist researchers with efficient, high-quality reviews. This review explores the integration of LLMs into peer review, highlighting their strengths in linguistic tasks and challenges in assessing scientific validity, particularly in clinical medicine. Key points for integration include initial screening, reviewer matching, feedback support, and language review. However, implementing LLMs for these purposes will necessitate addressing biases, privacy concerns, and data confidentiality. We recommend using LLMs as complementary tools under clear guidelines to support, not replace, human expertise in maintaining rigorous peer review standards.
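
As one concrete illustration of the "initial screening" use case, the sketch below wraps a screening rubric around a chat-completion call. This is a hedged example only: the OpenAI SDK stands in for any LLM interface, and the model name and rubric text are assumptions rather than anything the article prescribes.

```python
# Illustrative sketch of LLM-assisted initial screening for editors.
# The SDK, model name, and rubric are placeholders, not the article's workflow.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SCREENING_RUBRIC = (
    "You assist human editors with an initial screen of a manuscript abstract. "
    "Report, as bullet points: scope fit for a medical education journal, "
    "apparent reporting-guideline adherence, obvious methodological red flags, "
    "and language quality. Do not give an accept/reject recommendation."
)

def initial_screen(abstract: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model would do
        messages=[
            {"role": "system", "content": SCREENING_RUBRIC},
            {"role": "user", "content": abstract},
        ],
    )
    return response.choices[0].message.content
```

The rubric deliberately withholds an accept/reject verdict, mirroring the review's recommendation that LLMs support, not replace, human judgment; confidential manuscripts would also require appropriate data-protection arrangements before any such call.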

{"title":"The role of large language models in the peer-review process: opportunities and challenges for medical journal reviewers and editors.","authors":"Jisoo Lee, Jieun Lee, Jeong-Ju Yoo","doi":"10.3352/jeehp.2025.22.4","DOIUrl":"10.3352/jeehp.2025.22.4","url":null,"abstract":"<p><p>The peer review process ensures the integrity of scientific research. This is particularly important in the medical field, where research findings directly impact patient care. However, the rapid growth of publications has strained reviewers, causing delays and potential declines in quality. Generative artificial intelligence, especially large language models (LLMs) such as ChatGPT, may assist researchers with efficient, high-quality reviews. This review explores the integration of LLMs into peer review, highlighting their strengths in linguistic tasks and challenges in assessing scientific validity, particularly in clinical medicine. Key points for integration include initial screening, reviewer matching, feedback support, and language review. However, implementing LLMs for these purposes will necessitate addressing biases, privacy concerns, and data confidentiality. We recommend using LLMs as complementary tools under clear guidelines to support, not replace, human expertise in maintaining rigorous peer review standards.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"4"},"PeriodicalIF":9.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11952698/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143693856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Performance of large language models on Thailand’s national medical licensing examination: a cross-sectional study.
IF 9.3 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-05-12 | DOI: 10.3352/jeehp.2025.22.16
Prut Saowaprut, Romen Samuel Wabina, Junwei Yang, Lertboon Siriwat

Purpose: This study aimed to evaluate the feasibility of general-purpose large language models (LLMs) in addressing inequities in medical licensure exam preparation for Thailand’s National Medical Licensing Examination (ThaiNLE), which currently lacks standardized public study materials.

Methods: We assessed 4 multi-modal LLMs (GPT-4, Claude 3 Opus, Gemini 1.0/1.5 Pro) using a 304-question ThaiNLE Step 1 mock examination (10.2% image-based), applying deterministic API configurations and 5 inference repetitions per model. Performance was measured via micro- and macro-accuracy metrics compared against historical passing thresholds.
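
The micro- versus macro-accuracy distinction matters because domains differ in size: micro-accuracy pools all questions, while macro-accuracy averages the per-domain accuracies. A minimal sketch with illustrative numbers (not study data):

```python
# Micro- vs. macro-accuracy over per-question results; numbers are illustrative.
from collections import defaultdict

def micro_macro_accuracy(results):
    """results: iterable of (domain, is_correct) pairs, one per question."""
    per_domain = defaultdict(lambda: [0, 0])  # domain -> [correct, total]
    for domain, correct in results:
        per_domain[domain][0] += int(correct)
        per_domain[domain][1] += 1
    micro = sum(c for c, _ in per_domain.values()) / sum(t for _, t in per_domain.values())
    macro = sum(c / t for c, t in per_domain.values()) / len(per_domain)
    return micro, macro

# A model strong on a large domain but weak on a small one scores
# noticeably higher on micro- than on macro-accuracy.
results = ([("pharmacology", True)] * 90 + [("pharmacology", False)] * 10
           + [("genetics", True)] * 5 + [("genetics", False)] * 5)
print(micro_macro_accuracy(results))  # -> (0.8636..., 0.7)
```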

Results: All models exceeded passing scores, with GPT-4 achieving the highest accuracy (88.9%; 95% confidence interval, 88.7–89.1), surpassing Thailand’s national average by more than 2 standard deviations. Claude 3.5 Sonnet (80.1%) and Gemini 1.5 Pro (72.8%) followed in that order. Models demonstrated robustness across 17 of 20 medical domains, but variability was noted in genetics (74.0%) and cardiovascular topics (58.3%). While models demonstrated proficiency with images (Gemini 1.0 Pro: +9.9% vs. text), text-only accuracy remained superior (GPT-4o: 90.0% vs. 82.6%).

Conclusion: General-purpose LLMs show promise as equitable preparatory tools for ThaiNLE Step 1. However, domain-specific knowledge gaps and inconsistent multi-modal integration warrant refinement before clinical deployment.

{"title":"Performance of large language models on Thailand’s national medical licensing examination: a cross-sectional study.","authors":"Prut Saowaprut, Romen Samuel Wabina, Junwei Yang, Lertboon Siriwat","doi":"10.3352/jeehp.2025.22.16","DOIUrl":"10.3352/jeehp.2025.22.16","url":null,"abstract":"<p><strong>Purpose: </strong>This study aimed to evaluate the feasibility of general-purpose large language models (LLMs) in addressing inequities in medical licensure exam preparation for Thailand’s National Medical Licensing Examination (ThaiNLE), which currently lacks standardized public study materials.</p><p><strong>Methods: </strong>We assessed 4 multi-modal LLMs (GPT-4, Claude 3 Opus, Gemini 1.0/1.5 Pro) using a 304-question ThaiNLE Step 1 mock examination (10.2% image-based), applying deterministic API configurations and 5 inference repetitions per model. Performance was measured via micro- and macro-accuracy metrics compared against historical passing thresholds.</p><p><strong>Results: </strong>All models exceeded passing scores, with GPT-4 achieving the highest accuracy (88.9%; 95% confidence interval, 88.7–89.1), surpassing Thailand’s national average by more than 2 standard deviations. Claude 3.5 Sonnet (80.1%) and Gemini 1.5 Pro (72.8%) followed hierarchically. Models demonstrated robustness across 17 of 20 medical domains, but variability was noted in genetics (74.0%) and cardiovascular topics (58.3%). While models demonstrated proficiency with images (Gemini 1.0 Pro: +9.9% vs. text), text-only accuracy remained superior (GPT4o: 90.0% vs. 82.6%).</p><p><strong>Conclusion: </strong>General-purpose LLMs show promise as equitable preparatory tools for ThaiNLE Step 1. However, domain-specific knowledge gaps and inconsistent multi-modal integration warrant refinement before clinical deployment.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"16"},"PeriodicalIF":9.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143986836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Effectiveness of interprofessional education enhanced by live consultation observations for healthcare students and new professionals in Singapore: a retrospective cross-sectional study.
IF 3.7 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-08-21 | DOI: 10.3352/jeehp.2025.22.21
Lynette Mei Lim Goh, Wai Leong Chiu, Sky Wei Chee Koh

This study aimed to evaluate whether incorporating live consultation observations into interprofessional education (IPE) improves learning evaluation scores among healthcare professionals and students. A retrospective cross-sectional analysis was conducted using evaluation data from AHP IPE sessions held from January 2020 to December 2023 across 7 primary care clinics in Singapore. Evaluation scores were compared between sessions with facilitated discussions only (n=667) and sessions with additional live consultation observations (n=501). Logistic regression was used to analyze factors associated with perfect evaluation scores. Sessions that included live consultations were significantly more likely to achieve perfect evaluation scores (odds ratio [OR], 1.68; 95% confidence interval [CI], 1.27-2.22). Nursing/care coordinators and allied health professionals (ORs of 2.07 and 1.76, respectively) were significantly more likely to give perfect scores than medical professionals. Healthcare professionals were also more likely to give perfect scores than students (OR, 1.52; 95% CI, 1.08-2.14), indicating enhanced perceived effectiveness. These findings support the use of experiential learning strategies to optimize interprofessional training outcomes.
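
For readers who want to see the shape of the analysis, below is a minimal sketch of odds-ratio estimation by logistic regression with statsmodels. The data are randomly generated and the variable names are assumptions, so it reproduces the method, not the study's numbers.

```python
# Sketch of the odds-ratio analysis on synthetic data; illustrative only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1168  # 667 discussion-only + 501 live-consultation sessions, as reported
df = pd.DataFrame({
    "perfect_score": rng.integers(0, 2, n),  # 1 = perfect evaluation score
    "live_consult": rng.integers(0, 2, n),   # 1 = session included live consultations
    "is_student": rng.integers(0, 2, n),     # 1 = student (vs. professional) respondent
})

model = smf.logit("perfect_score ~ live_consult + is_student", data=df).fit()
odds_ratios = np.exp(model.params)           # exponentiated coefficients = ORs
conf_int = np.exp(model.conf_int())          # 95% CIs on the odds-ratio scale
print(pd.concat([odds_ratios.rename("OR"), conf_int], axis=1))
```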

{"title":"Effectiveness of interprofessional education enhanced by live consultation observations for healthcare students and new professionals in Singapore: a retrospective cross-sectional study.","authors":"Lynette Mei Lim Goh, Wai Leong Chiu, Sky Wei Chee Koh","doi":"10.3352/jeehp.2025.22.21","DOIUrl":"10.3352/jeehp.2025.22.21","url":null,"abstract":"<p><p>This study aims to evaluate whether incorporating live consultation observations into interprofessional education (IPE) improves learning evaluation scores among healthcare professionals and students. A retrospective cross-sectional analysis was conducted using evaluation data from AHP IPE sessions held from January 2020 to December 2023 across 7 primary care clinics in Singapore. Evaluation scores were compared between sessions with facilitated discussions only (n=667) and sessions with additional live consultation observations (n=501). Logistic regression was used to analyze factors associated with perfect evaluation scores. Sessions that included live consultations were significantly more likely to achieve perfect evaluation scores (odds ratio [OR], 1.68; 95% confidence interval [CI], 1.27-2.22). Nursing/care coordinator and allied health professions (OR 2.07 and 1.76 respectively) were significantly more likely to give perfect scores compared to medical professions. Healthcare professionals were also more likely to give perfect scores than students (OR, 1.52; 95% CI,1.08-2.14), indicating enhanced perceived effectiveness. These findings support the use of experiential learning strategies to optimize interprofessional training outcomes.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"21"},"PeriodicalIF":3.7,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12768549/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145534594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The impact of artificial intelligence-driven simulation on the development of non-technical skills in medical education: a systematic review.
IF 3.7 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-11-24 | DOI: 10.3352/jeehp.2025.22.37
Sana Loubbairi, Yasmine El Moussaoui, Laila Lahlou, Imad Chakri, Hicham Nassik

Purpose: Artificial intelligence (AI)-driven simulation is an emerging approach in healthcare education that enhances learning effectiveness. This review examined its impact on the development of non-technical skills among medical learners.

Methods: Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a systematic review was conducted using the following databases: Web of Science, ScienceDirect, Scopus, and PubMed. The quality of the included studies was assessed using the Mixed Methods Appraisal Tool. The protocol was previously registered in PROSPERO (CRD420251038024).

Results: Of the 1,442 studies identified in the initial search, 20 met the inclusion criteria, involving 2,535 participants. The simulators varied considerably, ranging from platforms built on symbolic AI methods to social robots powered by computational AI. Among the 15 AI-driven simulators, 10 used ChatGPT or its variants as virtual patients. Several studies evaluated multiple non-technical skills simultaneously. Communication and clinical reasoning were the most frequently assessed skills, appearing in 12 and 6 studies, respectively, and these studies generally reported positive outcomes. Improvements were also noted in decision-making, empathy, self-confidence, critical thinking, and problem-solving. In contrast, emotional regulation, assessed in a single study, showed no significant difference. Notably, none of the studies examined reflection, reflective practice, teamwork, or leadership.

Conclusion: AI-driven simulation shows substantial potential for enhancing non-technical skills in medical education, particularly communication and clinical reasoning. However, its effects on several other non-technical skills remain unclear. Given heterogeneity in study designs and outcome measures, these findings should be interpreted cautiously. These considerations highlight the need for further research to support integrating this innovative approach into medical curricula.

{"title":"The impact of artificial intelligence-driven simulation on the development of non-technical skills in medical education: a systematic review.","authors":"Sana Loubbairi, Yasmine El Moussaoui, Laila Lahlou, Imad Chakri, Hicham Nassik","doi":"10.3352/jeehp.2025.22.37","DOIUrl":"https://doi.org/10.3352/jeehp.2025.22.37","url":null,"abstract":"<p><strong>Purpose: </strong>Artificial intelligence (AI)-driven simulation is an emerging approach in healthcare education that enhances learning effectiveness. This review examined its impact on the development of non-technical skills among medical learners.</p><p><strong>Methods: </strong>Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a systematic review was conducted using the following databases: Web of Science, ScienceDirect, Scopus, and PubMed. The quality of the included studies was assessed using the Mixed.</p><p><strong>Methods: </strong>Appraisal Tool. The protocol was previously registered in PROSPERO (CRD420251038024).</p><p><strong>Results: </strong>Of the 1,442 studies identified in the initial search, 20 met the inclusion criteria, involving 2,535 participants. The simulators varied considerably, ranging from platforms built on symbolic AI methods to social robots powered by computational AI. Among the 15 AI-driven simulators, 10 used ChatGPT or its variants as virtual patients. Several studies evaluated multiple non-technical skills simultaneously. Communication and clinical reasoning were the most frequently assessed skills, appearing in 12 and 6 studies, respectively, which generally reported positive outcomes. Improvements were also noted in decision-making, empathy, self-confidence, critical thinking, and problem-solving. In contrast, emotional regulation, assessed in a single study, showed no significant difference. Notably, none of the studies examined reflection, reflective practice, teamwork, or leadership.</p><p><strong>Conclusion: </strong>AI-driven simulation shows substantial potential for enhancing non-technical skills in medical education, particularly communication and clinical reasoning. However, its effects on several other non-technical skills remain unclear. Given heterogeneity in study designs and outcome measures, these findings should be interpreted cautiously. These considerations highlight the need for further research to support integrating this innovative approach into medical curricula.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"37"},"PeriodicalIF":3.7,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146020145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A nationwide survey on the curriculum and educational resources related to the Clinical Skills Test of the Korean Medical Licensing Examination: a cross-sectional descriptive study.
IF 9.3 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-03-13 | DOI: 10.3352/jeehp.2025.22.11
Eun-Kyung Chung, Seok Hoon Kang, Do-Hoon Kim, MinJeong Kim, Ji-Hyun Seo, Keunmi Lee, Eui-Ryoung Han

Purpose: The revised Clinical Skills Test (CST) of the Korean Medical Licensing Exam aims to provide a better assessment of physicians’ clinical competence and ability to interact with patients. This study examined the impact of the revised CST on medical education curricula and resources nationwide, while also identifying areas for improvement within the revised CST.

Methods: This study surveyed faculty responsible for clinical clerkships at 40 medical schools throughout Korea to evaluate the status and changes in clinical skills education, assessment, and resources related to the CST. The researchers distributed the survey via email through regional consortia between December 7, 2023 and January 19, 2024.

Results: Nearly all schools implemented preliminary student–patient encounters during core clinical rotations. Schools primarily conducted clinical skills assessments in the third and fourth years, with a simplified form introduced in the first and second years. Remedial education was conducted through various methods, including one-on-one feedback from faculty after the assessment. All schools established clinical skills centers and made ongoing improvements. Faculty members did not perceive the CST revisions as significantly altering clinical clerkship or skills assessments. They suggested several improvements, including assessing patient records to improve accuracy and increasing the objectivity of standardized patient assessments to ensure fairness.

Conclusion: During the CST, students’ involvement in patient encounters and clinical skills education increased, improving the assessment and feedback processes for clinical skills within the curriculum. To enhance students’ clinical competencies and readiness, strengthening the validity and reliability of the CST is essential.

{"title":"A nationwide survey on the curriculum and educational resources related to the Clinical Skills Test of the Korean Medical Licensing Examination: a cross-sectional descriptive study.","authors":"Eun-Kyung Chung, Seok Hoon Kang, Do-Hoon Kim, MinJeong Kim, Ji-Hyun Seo, Keunmi Lee, Eui-Ryoung Han","doi":"10.3352/jeehp.2025.22.11","DOIUrl":"10.3352/jeehp.2025.22.11","url":null,"abstract":"<p><strong>Purpose: </strong>The revised Clinical Skills Test (CST) of the Korean Medical Licensing Exam aims to provide a better assessment of physicians’ clinical competence and ability to interact with patients. This study examined the impact of the revised CST on medical education curricula and resources nationwide, while also identifying areas for improvement within the revised CST.</p><p><strong>Methods: </strong>This study surveyed faculty responsible for clinical clerkships at 40 medical schools throughout Korea to evaluate the status and changes in clinical skills education, assessment, and resources related to the CST. The researchers distributed the survey via email through regional consortia between December 7, 2023 and January 19, 2024.</p><p><strong>Results: </strong>Nearly all schools implemented preliminary student–patient encounters during core clinical rotations. Schools primarily conducted clinical skills assessments in the third and fourth years, with a simplified form introduced in the first and second years. Remedial education was conducted through various methods, including oneon-one feedback from faculty after the assessment. All schools established clinical skills centers and made ongoing improvements. Faculty members did not perceive the CST revisions as significantly altering clinical clerkship or skills assessments. They suggested several improvements, including assessing patient records to improve accuracy and increasing the objectivity of standardized patient assessments to ensure fairness.</p><p><strong>Conclusion: </strong>During the CST, students’ involvement in patient encounters and clinical skills education increased, improving the assessment and feedback processes for clinical skills within the curriculum. To enhance students’ clinical competencies and readiness, strengthening the validity and reliability of the CST is essential.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"11"},"PeriodicalIF":9.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12042100/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143617568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Radiotorax.es: a web-based tool for formative self-assessment in chest X-ray interpretation.
IF 3.7 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-06-09 | DOI: 10.3352/jeehp.2025.22.17
Verónica Illescas-Megías, Jorge Manuel Maqueda-Pérez, Dolores Domínguez-Pinos, Teodoro Rudolphi Solero, Francisco Sendra-Portero

Radiotorax.es is a free, non-profit web-based tool designed to support formative self-assessment in chest X-ray interpretation. This article presents its structure, educational applications, and usage data from 11 years of continuous operation. Users complete interpretation rounds of 20 clinical cases, compare their reports with expert evaluations, and conduct a structured self-assessment. From 2011 to 2022, 14,389 users registered, and 7,726 completed at least one session. Most were medical students (75.8%), followed by residents (15.2%) and practicing physicians (9.0%). The platform has been integrated into undergraduate medical curricula and used in various educational contexts, including tutorials, peer and expert review, and longitudinal tracking. Its flexible design supports self-directed learning, instructor-guided use, and multicenter research. As a freely accessible resource based on real clinical cases, Radiotorax.es provides a scalable, realistic, and well-received training environment that promotes diagnostic skill development, reflection, and educational innovation in radiology education.

{"title":"Radiotorax.es: a web-based tool for formative self-assessment in chest X-ray interpretation.","authors":"Verónica Illescas-Megías, Jorge Manuel Maqueda-Pérez, Dolores Domínguez-Pinos, Teodoro Rudolphi Solero, Francisco Sendra-Portero","doi":"10.3352/jeehp.2025.22.17","DOIUrl":"10.3352/jeehp.2025.22.17","url":null,"abstract":"<p><p>Radiotorax.es is a free, non-profit web-based tool designed to support formative self-assessment in chest X-ray interpretation. This article presents its structure, educational applications, and usage data from 11 years of continuous operation. Users complete interpretation rounds of 20 clinical cases, compare their reports with expert evaluations, and conduct a structured self-assessment. From 2011 to 2022, 14,389 users registered, and 7,726 completed at least one session. Most were medical students (75.8%), followed by residents (15.2%) and practicing physicians (9.0%). The platform has been integrated into undergraduate medical curricula and used in various educational contexts, including tutorials, peer and expert review, and longitudinal tracking. Its flexible design supports self-directed learning, instructor-guided use, and multicenter research. As a freely accessible resource based on real clinical cases, Radiotorax.es provides a scalable, realistic, and well-received training environment that promotes diagnostic skill development, reflection, and educational innovation in radiology education.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"17"},"PeriodicalIF":3.7,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144250213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Feasibility of applying computerized adaptive testing to the Clinical Medical Science Comprehensive Examination in Korea: a psychometric study.
IF 3.7 | Q1 (Education, Scientific Disciplines) | Pub Date: 2025-01-01 | Epub Date: 2025-10-01 | DOI: 10.3352/jeehp.2025.22.29
Jeongwook Choi, Sung-Soo Jung, Eun Kwang Choi, Kyung Sik Kim, Dong Gi Seo

Purpose: This study aimed to investigate the feasibility of transitioning the Clinical Medical Science Comprehensive Examination (CMSCE) to computerized adaptive testing (CAT) in Korea, thereby providing greater opportunities for medical students to accurately compare their clinical competencies with peers nationwide and to monitor their own progress.

Methods: A medical self-assessment using CAT was conducted from March to June 2023, involving 1,541 medical students who volunteered from 40 medical colleges in Korea. An item bank consisting of 1,145 items from previously administered CMSCE examinations (2019-2021) hosted by the Medical Education Assessment Corporation was established. Items were selected through 2-stage filtering, based on classical test theory (discrimination index above 0.15) and item response theory (discrimination parameter estimates above 0.6 and difficulty parameter estimates between -5 and +5). Maximum Fisher information was employed as the item selection method, and maximum likelihood estimation was used for ability estimation.
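
The adaptive loop this describes (a 2PL item bank, maximum Fisher information item selection, maximum likelihood ability estimation, and stopping at a standard error below 0.25 or 50 items) can be sketched compactly. The item parameters below are simulated, not drawn from the CMSCE bank:

```python
# Sketch of the CAT loop described in Methods, on simulated 2PL parameters.
import numpy as np

rng = np.random.default_rng(1)
n_items = 1145
a = rng.uniform(0.6, 2.0, n_items)   # discrimination (post-filter: > 0.6)
b = rng.uniform(-3.0, 3.0, n_items)  # difficulty (post-filter: within [-5, +5])

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def mle_theta(a, b, x, grid=np.linspace(-4, 4, 801)):
    """Grid-search maximum-likelihood ability estimate for responses x."""
    p = p_correct(grid[:, None], a, b)
    loglik = (x * np.log(p) + (1 - x) * np.log(1 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]

true_theta = 0.8                      # simulated examinee
theta, used, resp = 0.0, [], []
while len(used) < 50:                 # item cap reported in Results
    cand = fisher_info(theta, a, b)
    cand[used] = -np.inf              # never re-administer an item
    item = int(np.argmax(cand))       # maximum Fisher information selection
    used.append(item)
    resp.append(rng.random() < p_correct(true_theta, a[item], b[item]))
    theta = mle_theta(a[used], b[used], np.array(resp, dtype=float))
    sem = 1.0 / np.sqrt(fisher_info(theta, a[used], b[used]).sum())
    if sem < 0.25:                    # stopping rule from the study
        break
print(f"theta={theta:.2f}, items={len(used)}, sem={sem:.3f}")
```

Depending on the simulated draw, the loop terminates once the SEM criterion is met or the 50-item cap is reached, mirroring the behavior reported in the Results below.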

Results: The CAT was successfully administered without significant issues. The stopping rule was set at a standard error of measurement of 0.25, with a maximum of 50 items for ability estimation. The mean ability score was 0.55, with an average of 28 items administered per student. Students at extreme ability levels reached the maximum of 50 items due to the limited availability of items at appropriate difficulty levels.

Conclusion: The medical self-assessment CAT, the first of its kind in Korea, was successfully implemented nationwide without significant problems. These results indicate strong potential for expanding the use of CAT in medical education assessments.

{"title":"Feasibility of applying computerized adaptive testing to the Clinical Medical Science Comprehensive Examination in Korea: a psychometric study.","authors":"Jeongwook Choi, Sung-Soo Jung, Eun Kwang Choi, Kyung Sik Kim, Dong Gi Seo","doi":"10.3352/jeehp.2025.22.29","DOIUrl":"10.3352/jeehp.2025.22.29","url":null,"abstract":"<p><strong>Purpose: </strong>This study aimed to investigate the feasibility of transitioning the Clinical Medical Science Comprehensive Examination (CMSCE) to computerized adaptive testing (CAT) in Korea, thereby providing greater opportunities for medical students to accurately compare their clinical competencies with peers nationwide and to monitor their own progress.</p><p><strong>Methods: </strong>A medical self-assessment using CAT was conducted from March to June 2023, involving 1,541 medical students who volunteered from 40 medical colleges in Korea. An item bank consisting of 1,145 items from previously administered CMSCE examinations (2019-2021) hosted by the Medical Education Assessment Corporation was established. Items were selected through 2-stage filtering, based on classical test theory (discrimination index above 0.15) and item response theory (discrimination parameter estimates above 0.6 and difficulty parameter estimates between -5 and +5). Maximum Fisher information was employed as the item selection method, and maximum likelihood estimation was used for ability estimation.</p><p><strong>Results: </strong>The CAT was successfully administered without significant issues. The stopping rule was set at a standard error of measurement of 0.25, with a maximum of 50 items for ability estimation. The mean ability score was 0.55, with an average of 28 items administered per student. Students at extreme ability levels reached the maximum of 50 items due to the limited availability of items at appropriate difficulty levels.</p><p><strong>Conclusion: </strong>The medical self-assessment CAT, the first of its kind in Korea, was successfully implemented nationwide without significant problems. These results indicate strong potential for expanding the use of CAT in medical education assessments.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"29"},"PeriodicalIF":3.7,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145201748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0