
Latest publications from the Journal of Educational Evaluation for Health Professions

Strategies for remediating clinical reasoning skill deficits in underperforming residents: a scoping review.
IF 3.7 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2026-01-01 Epub Date: 2026-02-05 DOI: 10.3352/jeehp.2026.23.3
Jovian Philip Swatan, Fithriyah Cholifatul Ummah, Cecilia Felicia Chandra, Nooreen Adnan

Clinical reasoning is a core competency in medical practice; however, deficits in this domain among residents are often difficult to identify and remediate because of its cognitive complexity and the absence of standardized assessment approaches. This scoping review aimed to map and analyze existing evidence on strategies to remediate clinical reasoning skill deficits in underperforming medical residents. Using the Arksey and O'Malley framework as refined by Levac and colleagues, and reported in accordance with PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines, we systematically searched PubMed, Scopus, MEDLINE, Web of Science, SpringerLink, ProQuest, and EBSCOhost for studies published between 2000 and 2024. Definitions of clinical reasoning, underperformance, and remediation were adopted from prior literature. Twenty studies met the inclusion criteria, comprising original research and literature reviews in multiple medical specialties. Methods for identifying clinical reasoning deficits included written, oral, and performance-based assessments, as well as routine workplace-based evaluations. Remediation strategies ranged from structured institutional programs to individualized, case-specific interventions, with coaching, deliberate practice, guided reflection, and structured thinking frameworks frequently employed. Two studies reported positive outcomes following completion of remediation for clinical reasoning deficits. Key enablers included psychological safety, learner engagement, and accessible faculty support, whereas barriers included learner resistance, inadequate baseline knowledge, faculty skill limitations, and institutional resource constraints. Effective remediation requires early identification, comprehensive diagnostic assessment, and tailored, coaching-based interventions supported by institutional commitment. Nonetheless, substantial variability in definitions, remediation protocols, and evaluation methods highlights the need for greater standardization and further research across diverse contexts to inform evidence-based frameworks for clinical reasoning remediation.

{"title":"Strategies for remediating clinical reasoning skill deficits in underperforming residents: a scoping review.","authors":"Jovian Philip Swatan, Fithriyah Cholifatul Ummah, Cecilia Felicia Chandra, Nooreen Adnan","doi":"10.3352/jeehp.2026.23.3","DOIUrl":"https://doi.org/10.3352/jeehp.2026.23.3","url":null,"abstract":"<p><p>Clinical reasoning is a core competency in medical practice; however, deficits in this domain among residents are often difficult to identify and remediate because of its cognitive complexity and the absence of standardized assessment approaches. This scoping review aimed to map and analyze existing evidence on strategies to remediate clinical reasoning skill deficits in underperforming medical residents. Using the Arksey and O'Malley framework as refined by Levac and his colleagues, and reported in accordance with PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines, we systematically searched PubMed, Scopus, MEDLINE, Web of Science, SpringerLink, ProQuest, and EBSCOhost for studies published between 2000 and 2024. Definitions of clinical reasoning, underperformance, and remediation were adopted from prior literature. Twenty studies met the inclusion criteria, comprising original research and literature reviews in multiple medical specialties. Methods for identifying clinical reasoning deficits included written, oral, and performance-based assessments, as well as routine workplace-based evaluations. Remediation strategies ranged from structured institutional programs to individualized, case-specific interventions, with coaching, deliberate practice, guided reflection, and structured thinking frameworks frequently employed. Two studies reported positive outcomes following completion of remediation for clinical reasoning deficits. Key enablers included psychological safety, learner engagement, and accessible faculty support, whereas barriers included learner resistance, inadequate baseline knowledge, faculty skill limitations, and institutional resource constraints. Effective remediation requires early identification, comprehensive diagnostic assessment, and tailored, coaching-based interventions supported by institutional commitment. Nonetheless, substantial variability in definitions, remediation protocols, and evaluation methods highlights the need for greater standardization and further research across diverse contexts to inform evidence-based frameworks for clinical reasoning remediation.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"23 ","pages":"3"},"PeriodicalIF":3.7,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146158767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Presidential address 2026: celebrating academic excellence and expanding computer-based testing across health professions.
IF 3.7 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2026-01-01 Epub Date: 2026-01-09 DOI: 10.3352/jeehp.2026.23.1
Hyunjoo Pai
{"title":"Presidential address 2026: celebrating academic excellence and expanding computer-based testing across health professions.","authors":"Hyunjoo Pai","doi":"10.3352/jeehp.2026.23.1","DOIUrl":"10.3352/jeehp.2026.23.1","url":null,"abstract":"","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"23 ","pages":"1"},"PeriodicalIF":3.7,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976626/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145999404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Comparison of reference management software with new artificial intelligence-based tools.
IF 3.7 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2026-01-01 Epub Date: 2026-01-15 DOI: 10.3352/jeehp.2026.23.2
Jae Gyeong Jin, Seung Gyu Lee, Jea Hyeun Park, Jang Won Han, Jae Young Kim, Jungirl Seok, Jeong-Ju Yoo

Reference management software (RMS) represents a cornerstone of modern academic writing and publishing. For decades, programs such as EndNote, Zotero, and Mendeley have played central roles in facilitating citation organization, bibliography formatting, and collaborative scholarship. Although each platform has introduced unique innovations, persistent limitations remain, particularly with respect to usability, accessibility, and accuracy. In parallel, the rise of generative artificial intelligence has introduced an unprecedented challenge: the inadvertent incorporation of fabricated or incorrect references into manuscripts. This phenomenon has exposed a critical limitation of traditional RMS platforms, namely their inability to verify reference authenticity. Against this backdrop, new solutions have emerged. One such example is CiteWell (https://citewell.org/), an artificial intelligence (AI)-era RMS that introduces several notable innovations, including PubMed-integrated verification, an intuitive interface for new users, customizable journal-specific styles, and multilingual accessibility. This review provides a comprehensive historical overview of RMS, evaluates the strengths and weaknesses of major platforms, and positions emerging AI-based tools as a new paradigm that combines traditional reference management with essential safeguards for contemporary academic challenges.
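The PubMed-integrated verification described above can be illustrated with NCBI's public E-utilities API. The sketch below shows the general technique of checking whether a cited title resolves to any PubMed record; it is an assumption-based illustration, not CiteWell's actual implementation, and the verify_reference helper and example title are hypothetical.

```python
# Minimal sketch of reference verification against PubMed via NCBI E-utilities.
# Illustrates the general technique only; not CiteWell's actual code.
import json
import urllib.parse
import urllib.request

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def verify_reference(title: str) -> bool:
    """Return True if an exact-title search finds at least one PubMed record."""
    params = urllib.parse.urlencode({
        "db": "pubmed",
        "term": f'"{title}"[Title]',  # exact-title field search
        "retmode": "json",
        "retmax": "1",
    })
    with urllib.request.urlopen(f"{ESEARCH}?{params}", timeout=10) as resp:
        result = json.load(resp)["esearchresult"]
    return int(result.get("count", "0")) > 0

if __name__ == "__main__":
    # A hallucinated citation like this hypothetical one would be flagged (False).
    fake = "A fabricated reference that should not resolve to any PubMed record"
    print("Found in PubMed:", verify_reference(fake))
```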

{"title":"Comparison of reference management software with new artificial intelligence-based tools.","authors":"Jae Gyeong Jin, Seung Gyu Lee, Jea Hyeun Park, Jang Won Han, Jae Young Kim, Jungirl Seok, Jeong-Ju Yoo","doi":"10.3352/jeehp.2026.23.2","DOIUrl":"10.3352/jeehp.2026.23.2","url":null,"abstract":"<p><p>Reference management software (RMS) represents a cornerstone of modern academic writing and publishing. For decades, programs such as EndNote, Zotero, and Mendeley have played central roles in facilitating citation organization, bibliography formatting, and collaborative scholarship. Although each platform has introduced unique innovations, persistent limitations remain, particularly with respect to usability, accessibility, and accuracy. In parallel, the rise of generative artificial intelligence has introduced an unprecedented challenge: the inadvertent inclusion of fabricated or incorrect references mistakenly incorporated into manuscripts. This phenomenon has exposed a critical limitation of traditional RMS platforms, namely their inability to verify reference authenticity. Against this backdrop, new solutions have emerged. One such example is CiteWell (https://citewell.org/), an artificial intelligence (AI)-era RMS that introduces several notable innovations, including PubMed-integrated verification, an intuitive interface for new users, customizable journal-specific styles, and multilingual accessibility. This review provides a comprehensive historical overview of RMS, evaluates the strengths and weaknesses of major platforms, and positions emerging AI-based tools as a new paradigm that combines traditional reference management with essential safeguards for contemporary academic challenges.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"23 ","pages":"2"},"PeriodicalIF":3.7,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976740/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145999443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Implementation of artificial intelligence in the 2025 medical parasitology course at Hallym University.
IF 3.7 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2026-01-01 Epub Date: 2026-02-05 DOI: 10.3352/jeehp.2026.23.4
Eun Hee Ha
{"title":"Implementation of artificial intelligence in the 2025 medical parasitology course at Hallym University.","authors":"Eun Hee Ha","doi":"10.3352/jeehp.2026.23.4","DOIUrl":"10.3352/jeehp.2026.23.4","url":null,"abstract":"","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"23 ","pages":"4"},"PeriodicalIF":3.7,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976625/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146158799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Accuracy of ChatGPT in answering cardiology board-style questions.
IF 9.3 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2025-01-01 Epub Date: 2025-02-27 DOI: 10.3352/jeehp.2025.22.9
Albert Andrew
{"title":"Accuracy of ChatGPT in answering cardiology board-style questions.","authors":"Albert Andrew","doi":"10.3352/jeehp.2025.22.9","DOIUrl":"10.3352/jeehp.2025.22.9","url":null,"abstract":"","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"9"},"PeriodicalIF":9.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12042102/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143517011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The impact of artificial intelligence-driven simulation on the development of non-technical skills in medical education: a systematic review.
IF 3.7 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2025-01-01 Epub Date: 2025-11-24 DOI: 10.3352/jeehp.2025.22.37
Sana Loubbairi, Yasmine El Moussaoui, Laila Lahlou, Imad Chakri, Hicham Nassik

Purpose: Artificial intelligence (AI)-driven simulation is an emerging approach in healthcare education that enhances learning effectiveness. This review examined its impact on the development of non-technical skills among medical learners.

Methods: Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a systematic review was conducted using the following databases: Web of Science, ScienceDirect, Scopus, and PubMed. The quality of the included studies was assessed using the Mixed Methods Appraisal Tool. The protocol was previously registered in PROSPERO (CRD420251038024).

Results: Of the 1,442 studies identified in the initial search, 20 met the inclusion criteria, involving 2,535 participants. The simulators varied considerably, ranging from platforms built on symbolic AI methods to social robots powered by computational AI. Among the 15 AI-driven simulators, 10 used ChatGPT or its variants as virtual patients. Several studies evaluated multiple non-technical skills simultaneously. Communication and clinical reasoning were the most frequently assessed skills, appearing in 12 and 6 studies, respectively, which generally reported positive outcomes. Improvements were also noted in decision-making, empathy, self-confidence, critical thinking, and problem-solving. In contrast, emotional regulation, assessed in a single study, showed no significant difference. Notably, none of the studies examined reflection, reflective practice, teamwork, or leadership.

Conclusion: AI-driven simulation shows substantial potential for enhancing non-technical skills in medical education, particularly communication and clinical reasoning. However, its effects on several other non-technical skills remain unclear. Given heterogeneity in study designs and outcome measures, these findings should be interpreted cautiously. These considerations highlight the need for further research to support integrating this innovative approach into medical curricula.

{"title":"The impact of artificial intelligence-driven simulation on the development of non-technical skills in medical education: a systematic review.","authors":"Sana Loubbairi, Yasmine El Moussaoui, Laila Lahlou, Imad Chakri, Hicham Nassik","doi":"10.3352/jeehp.2025.22.37","DOIUrl":"https://doi.org/10.3352/jeehp.2025.22.37","url":null,"abstract":"<p><strong>Purpose: </strong>Artificial intelligence (AI)-driven simulation is an emerging approach in healthcare education that enhances learning effectiveness. This review examined its impact on the development of non-technical skills among medical learners.</p><p><strong>Methods: </strong>Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a systematic review was conducted using the following databases: Web of Science, ScienceDirect, Scopus, and PubMed. The quality of the included studies was assessed using the Mixed.</p><p><strong>Methods: </strong>Appraisal Tool. The protocol was previously registered in PROSPERO (CRD420251038024).</p><p><strong>Results: </strong>Of the 1,442 studies identified in the initial search, 20 met the inclusion criteria, involving 2,535 participants. The simulators varied considerably, ranging from platforms built on symbolic AI methods to social robots powered by computational AI. Among the 15 AI-driven simulators, 10 used ChatGPT or its variants as virtual patients. Several studies evaluated multiple non-technical skills simultaneously. Communication and clinical reasoning were the most frequently assessed skills, appearing in 12 and 6 studies, respectively, which generally reported positive outcomes. Improvements were also noted in decision-making, empathy, self-confidence, critical thinking, and problem-solving. In contrast, emotional regulation, assessed in a single study, showed no significant difference. Notably, none of the studies examined reflection, reflective practice, teamwork, or leadership.</p><p><strong>Conclusion: </strong>AI-driven simulation shows substantial potential for enhancing non-technical skills in medical education, particularly communication and clinical reasoning. However, its effects on several other non-technical skills remain unclear. Given heterogeneity in study designs and outcome measures, these findings should be interpreted cautiously. These considerations highlight the need for further research to support integrating this innovative approach into medical curricula.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"37"},"PeriodicalIF":3.7,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146020145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Performance of large language models on Thailand’s national medical licensing examination: a cross-sectional study.
IF 9.3 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2025-01-01 Epub Date: 2025-05-12 DOI: 10.3352/jeehp.2025.22.16
Prut Saowaprut, Romen Samuel Wabina, Junwei Yang, Lertboon Siriwat

Purpose: This study aimed to evaluate the feasibility of general-purpose large language models (LLMs) in addressing inequities in medical licensure exam preparation for Thailand’s National Medical Licensing Examination (ThaiNLE), which currently lacks standardized public study materials.

Methods: We assessed 4 multi-modal LLMs (GPT-4, Claude 3 Opus, Gemini 1.0/1.5 Pro) using a 304-question ThaiNLE Step 1 mock examination (10.2% image-based), applying deterministic API configurations and 5 inference repetitions per model. Performance was measured via micro- and macro-accuracy metrics compared against historical passing thresholds.
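For readers unfamiliar with the two accuracy metrics named above: micro-accuracy pools every question before dividing, while macro-accuracy averages per-domain accuracies so that small domains weigh equally. A minimal sketch follows, using hypothetical per-question records; the field names are assumptions, not the authors' data format.

```python
# Micro- vs. macro-accuracy over exam domains (illustrative; fields are hypothetical).
from collections import defaultdict

def micro_accuracy(records):
    """Pooled accuracy: every question counts equally."""
    return sum(r["correct"] for r in records) / len(records)

def macro_accuracy(records):
    """Mean of per-domain accuracies: every domain counts equally."""
    by_domain = defaultdict(list)
    for r in records:
        by_domain[r["domain"]].append(r["correct"])
    return sum(sum(v) / len(v) for v in by_domain.values()) / len(by_domain)

records = [  # e.g., one model's graded answers across two domains
    {"domain": "genetics", "correct": True},
    {"domain": "genetics", "correct": False},
    {"domain": "cardiovascular", "correct": True},
    {"domain": "cardiovascular", "correct": True},
    {"domain": "cardiovascular", "correct": True},
]
print(f"micro={micro_accuracy(records):.3f}, macro={macro_accuracy(records):.3f}")
# micro=0.800 (4/5 pooled), macro=0.750 (mean of 0.5 and 1.0)
```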

Results: All models exceeded passing scores, with GPT-4 achieving the highest accuracy (88.9%; 95% confidence interval, 88.7–89.1), surpassing Thailand’s national average by more than 2 standard deviations. Claude 3.5 Sonnet (80.1%) and Gemini 1.5 Pro (72.8%) ranked second and third, respectively. Models demonstrated robustness across 17 of 20 medical domains, but variability was noted in genetics (74.0%) and cardiovascular topics (58.3%). While models demonstrated proficiency with images (Gemini 1.0 Pro: +9.9% vs. text), text-only accuracy remained superior (GPT-4o: 90.0% vs. 82.6%).

Conclusion: General-purpose LLMs show promise as equitable preparatory tools for ThaiNLE Step 1. However, domain-specific knowledge gaps and inconsistent multi-modal integration warrant refinement before clinical deployment.

{"title":"Performance of large language models on Thailand’s national medical licensing examination: a cross-sectional study.","authors":"Prut Saowaprut, Romen Samuel Wabina, Junwei Yang, Lertboon Siriwat","doi":"10.3352/jeehp.2025.22.16","DOIUrl":"10.3352/jeehp.2025.22.16","url":null,"abstract":"<p><strong>Purpose: </strong>This study aimed to evaluate the feasibility of general-purpose large language models (LLMs) in addressing inequities in medical licensure exam preparation for Thailand’s National Medical Licensing Examination (ThaiNLE), which currently lacks standardized public study materials.</p><p><strong>Methods: </strong>We assessed 4 multi-modal LLMs (GPT-4, Claude 3 Opus, Gemini 1.0/1.5 Pro) using a 304-question ThaiNLE Step 1 mock examination (10.2% image-based), applying deterministic API configurations and 5 inference repetitions per model. Performance was measured via micro- and macro-accuracy metrics compared against historical passing thresholds.</p><p><strong>Results: </strong>All models exceeded passing scores, with GPT-4 achieving the highest accuracy (88.9%; 95% confidence interval, 88.7–89.1), surpassing Thailand’s national average by more than 2 standard deviations. Claude 3.5 Sonnet (80.1%) and Gemini 1.5 Pro (72.8%) followed hierarchically. Models demonstrated robustness across 17 of 20 medical domains, but variability was noted in genetics (74.0%) and cardiovascular topics (58.3%). While models demonstrated proficiency with images (Gemini 1.0 Pro: +9.9% vs. text), text-only accuracy remained superior (GPT4o: 90.0% vs. 82.6%).</p><p><strong>Conclusion: </strong>General-purpose LLMs show promise as equitable preparatory tools for ThaiNLE Step 1. However, domain-specific knowledge gaps and inconsistent multi-modal integration warrant refinement before clinical deployment.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"16"},"PeriodicalIF":9.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143986836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The role of large language models in the peer-review process: opportunities and challenges for medical journal reviewers and editors.
IF 9.3 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2025-01-01 Epub Date: 2025-01-16 DOI: 10.3352/jeehp.2025.22.4
Jisoo Lee, Jieun Lee, Jeong-Ju Yoo

The peer review process ensures the integrity of scientific research. This is particularly important in the medical field, where research findings directly impact patient care. However, the rapid growth of publications has strained reviewers, causing delays and potential declines in quality. Generative artificial intelligence, especially large language models (LLMs) such as ChatGPT, may assist researchers with efficient, high-quality reviews. This review explores the integration of LLMs into peer review, highlighting their strengths in linguistic tasks and challenges in assessing scientific validity, particularly in clinical medicine. Key points for integration include initial screening, reviewer matching, feedback support, and language review. However, implementing LLMs for these purposes will necessitate addressing biases, privacy concerns, and data confidentiality. We recommend using LLMs as complementary tools under clear guidelines to support, not replace, human expertise in maintaining rigorous peer review standards.

{"title":"The role of large language models in the peer-review process: opportunities and challenges for medical journal reviewers and editors.","authors":"Jisoo Lee, Jieun Lee, Jeong-Ju Yoo","doi":"10.3352/jeehp.2025.22.4","DOIUrl":"10.3352/jeehp.2025.22.4","url":null,"abstract":"<p><p>The peer review process ensures the integrity of scientific research. This is particularly important in the medical field, where research findings directly impact patient care. However, the rapid growth of publications has strained reviewers, causing delays and potential declines in quality. Generative artificial intelligence, especially large language models (LLMs) such as ChatGPT, may assist researchers with efficient, high-quality reviews. This review explores the integration of LLMs into peer review, highlighting their strengths in linguistic tasks and challenges in assessing scientific validity, particularly in clinical medicine. Key points for integration include initial screening, reviewer matching, feedback support, and language review. However, implementing LLMs for these purposes will necessitate addressing biases, privacy concerns, and data confidentiality. We recommend using LLMs as complementary tools under clear guidelines to support, not replace, human expertise in maintaining rigorous peer review standards.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"4"},"PeriodicalIF":9.3,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11952698/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143693856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Effectiveness of interprofessional education enhanced by live consultation observations for healthcare students and new professionals in Singapore: a retrospective cross-sectional study.
IF 3.7 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2025-01-01 Epub Date: 2025-08-21 DOI: 10.3352/jeehp.2025.22.21
Lynette Mei Lim Goh, Wai Leong Chiu, Sky Wei Chee Koh

This study aimed to evaluate whether incorporating live consultation observations into interprofessional education (IPE) improves learning evaluation scores among healthcare professionals and students. A retrospective cross-sectional analysis was conducted using evaluation data from AHP IPE sessions held from January 2020 to December 2023 across 7 primary care clinics in Singapore. Evaluation scores were compared between sessions with facilitated discussions only (n=667) and sessions with additional live consultation observations (n=501). Logistic regression was used to analyze factors associated with perfect evaluation scores. Sessions that included live consultations were significantly more likely to achieve perfect evaluation scores (odds ratio [OR], 1.68; 95% confidence interval [CI], 1.27-2.22). Nursing/care coordinator and allied health professionals (OR, 2.07 and 1.76, respectively) were significantly more likely to give perfect scores than medical professionals. Healthcare professionals were also more likely to give perfect scores than students (OR, 1.52; 95% CI, 1.08-2.14), indicating enhanced perceived effectiveness. These findings support the use of experiential learning strategies to optimize interprofessional training outcomes.
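For context on the odds ratios reported above: a fitted logistic regression coefficient exponentiates to an OR, and its confidence bounds exponentiate the same way. The sketch below demonstrates this on synthetic data whose effect size is chosen to echo the reported OR of 1.68; it is not the study's dataset or analysis code.

```python
# Odds ratios from logistic regression (illustrative; synthetic data, not the study's).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1168  # roughly the combined session count reported above (667 + 501)
live_consult = rng.integers(0, 2, n)        # 1 = session included live consultations
logit_p = -0.5 + 0.52 * live_consult        # true log-odds; exp(0.52) is approx. 1.68
perfect_score = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(live_consult.astype(float))
fit = sm.Logit(perfect_score, X).fit(disp=0)

or_est = np.exp(fit.params[1])              # odds ratio for live consultation
ci_low, ci_high = np.exp(fit.conf_int()[1]) # 95% CI bounds, exponentiated
print(f"OR={or_est:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```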

{"title":"Effectiveness of interprofessional education enhanced by live consultation observations for healthcare students and new professionals in Singapore: a retrospective cross-sectional study.","authors":"Lynette Mei Lim Goh, Wai Leong Chiu, Sky Wei Chee Koh","doi":"10.3352/jeehp.2025.22.21","DOIUrl":"10.3352/jeehp.2025.22.21","url":null,"abstract":"<p><p>This study aims to evaluate whether incorporating live consultation observations into interprofessional education (IPE) improves learning evaluation scores among healthcare professionals and students. A retrospective cross-sectional analysis was conducted using evaluation data from AHP IPE sessions held from January 2020 to December 2023 across 7 primary care clinics in Singapore. Evaluation scores were compared between sessions with facilitated discussions only (n=667) and sessions with additional live consultation observations (n=501). Logistic regression was used to analyze factors associated with perfect evaluation scores. Sessions that included live consultations were significantly more likely to achieve perfect evaluation scores (odds ratio [OR], 1.68; 95% confidence interval [CI], 1.27-2.22). Nursing/care coordinator and allied health professions (OR 2.07 and 1.76 respectively) were significantly more likely to give perfect scores compared to medical professions. Healthcare professionals were also more likely to give perfect scores than students (OR, 1.52; 95% CI,1.08-2.14), indicating enhanced perceived effectiveness. These findings support the use of experiential learning strategies to optimize interprofessional training outcomes.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"21"},"PeriodicalIF":3.7,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12768549/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145534594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Development and psychometric assessment of a scale for evaluating healthcare professionals' attitudes toward interprofessional education and collaboration in the United States: a cross-sectional study.
IF 3.7 Q1 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2025-01-01 Epub Date: 2025-10-20 DOI: 10.3352/jeehp.2025.22.32
Michael Christopher Banks, Ryan Brock Mutcheson, Maedot Ariaya Haymete, Serkan Toy

Purpose: Interprofessional education (IPE) is increasingly recognized as critical to preparing health professionals for collaborative practice, yet rigorous assessment remains limited by a lack of psychometrically sound instruments. Building on a previously developed questionnaire for physicians, this study aimed to expand the scale to include allied health professionals and to evaluate whether the factor structure remained consistent across professions. We hypothesized that a similar factor structure would emerge from the combined dataset, thereby supporting the scale's generalizability.

Methods: This observational study included 930 healthcare professionals in the United States (379 physicians, 419 nurses, 76 pharmacists, and others) who completed a 35-item questionnaire addressing IPE competency domains. Data were collected between December 2019 and May 2020. Exploratory factor analysis was employed to examine the factor structure, followed by item response theory (IRT) analyses to assess item fit, reliability, and validity. Raw data are available upon request.
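To make the reliability figures in the Results concrete, the sketch below computes Cronbach's alpha for a subscale from a respondents-by-items matrix. The data are synthetic and the sketch is illustrative only, not the study's analysis code.

```python
# Cronbach's alpha for a subscale, from a respondents-by-items score matrix.
# Synthetic data for illustration; the study itself used EFA and IRT on 930 respondents.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: shape (n_respondents, k_items).

    alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)
    """
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(size=(930, 1))                     # one shared trait
items = latent + rng.normal(scale=1.0, size=(930, 5))  # 5 items loading on it
print(f"alpha = {cronbach_alpha(items):.2f}")           # approx. 0.83 by construction
```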

Results: Factor analysis of 22 retained items confirmed a 5-factor solution: teamwork and communication, patient-centered care, roles and responsibilities, ethics and attitudes, and reflective practice, explaining 59% of the variance. Subscale reliabilities ranged from α=0.65 to 0.87. IRT analyses supported construct validity and measurement precision, while identifying areas for refinement in reflective practice.

Conclusion: This study demonstrates that the scale is reliable, valid, and generalizable across diverse health professions. It provides a robust tool for assessing attitudes toward IPE, offering value for curriculum evaluation, institutional benchmarking, and future longitudinal research on professional identity formation and collaborative practice.

{"title":"Development and psychometric assessment of a scale for evaluating healthcare professionals' attitudes toward interprofessional education and collaboration in the United States: a cross-sectional study.","authors":"Michael Christopher Banks, Ryan Brock Mutcheson, Maedot Ariaya Haymete, Serkan Toy","doi":"10.3352/jeehp.2025.22.32","DOIUrl":"10.3352/jeehp.2025.22.32","url":null,"abstract":"<p><strong>Purpose: </strong>Interprofessional education (IPE) is increasingly recognized as critical to preparing health professionals for collaborative practice, yet rigorous assessment remains limited by a lack of psychometrically sound instruments. Building on a previously developed questionnaire for physicians, this study aimed to expand the scale to include allied health professionals and to evaluate whether the factor structure remained consistent across professions. We hypothesized that a similar factor structure would emerge from the combined dataset, thereby supporting the scale's generalizability.</p><p><strong>Methods: </strong>This observational study included 930 healthcare professionals in the United States (379 physicians, 419 nurses, 76 pharmacists, and others) who completed a 35-item questionnaire addressing IPE competency domains. Data were collected between December 2019 and May 2020. Exploratory factor analysis was employed to examine the factor structure, followed by item response theory (IRT) analyses to assess item fit, reliability, and validity. Raw data are available upon request.</p><p><strong>Results: </strong>Factor analysis of 22 retained items confirmed a 5-factor solution: teamwork and communication, patient-centered care, roles and responsibilities, ethics and attitudes, and reflective practice, explaining 59% of the variance. Subscale reliabilities ranged from α=0.65 to 0.87. IRT analyses supported construct validity and measurement precision, while identifying areas for refinement in reflective practice.</p><p><strong>Conclusion: </strong>This study demonstrates that the scale is reliable, valid, and generalizable across diverse health professions. It provides a robust tool for assessing attitudes toward IPE, offering value for curriculum evaluation, institutional benchmarking, and future longitudinal research on professional identity formation and collaborative practice.</p>","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"22 ","pages":"32"},"PeriodicalIF":3.7,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12768546/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145330367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0