Purpose: The primary aim of this study is to validate the Blended Learning Usability Evaluation - Questionnaire (BLUE-Q) for use in the field of health professions education through a Bayesian approach. As Bayesian questionnaire validation remains elusive, a secondary aim of this article is to serve as a simplified tutorial for engaging in such validation practices in health professions education.
Methods: A total of 10 health education-based experts in blended learning were recruited to participate in a 30-minute interviewer-administered survey. On a 5-point Likert scale, experts rated how well they perceived each item of the BLUE-Q to reflect its underlying usability domain (i.e., effectiveness, efficiency, satisfaction, accessibility, organization, and learner experience). Ratings were descriptively analyzed and converted into beta prior distributions. Participants were also given the option to provide qualitative comments for each item.
Results: After reviewing the computed expert prior distributions, 31 quantitative items were identified as having a probability of 'low endorsement' and were thus removed from the questionnaire. Additionally, qualitative comments were used to revise the phrasing and order of items to ensure clarity and logical flow. The BLUE-Q's final version comprises 23 Likert-scale items and 6 open-ended items.
Conclusion: Questionnaire validation can generally be a complex, time-consuming, and costly process, inhibiting many from engaging in proper validation practices. In this study, we demonstrate that a Bayesian questionnaire validation approach can be a simple, resource-efficient, yet rigorous solution to validating a tool for content and item-domain correlation through the elicitation of domain expert endorsement ratings.
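The step of converting expert ratings into a beta distribution can be illustrated with a minimal sketch. It assumes, purely for illustration, that a rating of 4 or 5 counts as an endorsement, that the starting point is a uniform Beta(1, 1) distribution, and that "low endorsement" means an endorsement probability below 0.5. None of these choices, the function names, or the example ratings are taken from the study.

```python
from math import comb

def beta_from_ratings(ratings, cutoff=4):
    """Return (a, b) of a Beta distribution for the endorsement probability.

    Assumption: ratings >= cutoff count as endorsements and are added to a
    uniform Beta(1, 1) starting distribution.
    """
    endorsed = sum(1 for r in ratings if r >= cutoff)
    return 1 + endorsed, 1 + len(ratings) - endorsed

def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integer a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))

def prob_low_endorsement(ratings, threshold=0.5):
    """Probability that the endorsement rate lies below the threshold."""
    a, b = beta_from_ratings(ratings)
    return beta_cdf_int(threshold, a, b)

# Hypothetical ratings from 10 experts for two items (not study data).
strong_item = [5, 5, 4, 4, 5, 4, 5, 3, 4, 5]  # 9 of 10 endorse
weak_item = [2, 3, 4, 1, 3, 5, 2, 3, 4, 2]    # 3 of 10 endorse

print(prob_low_endorsement(strong_item))
print(prob_low_endorsement(weak_item))
```

Under these assumptions, an item such as `weak_item` would be a candidate for removal, while `strong_item` would be retained.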
This study examines the legality and appropriateness of keeping the multiple-choice question items of the Korean Medical Licensing Examination (KMLE) confidential. Through an analysis of cases from the United States, Canada, and Australia, where medical licensing exams are conducted using item banks and computer-based testing, we found that exam items are kept confidential to ensure fairness and prevent cheating. In Korea, the Korea Health Personnel Licensing Examination Institute (KHPLEI) has been disclosing KMLE questions despite concerns over exam integrity. Korean courts have consistently ruled that multiple-choice question items prepared by public institutions are non-public information under Article 9(1)(v) of the Korea Official Information Disclosure Act (KOIDA), which exempts disclosure if it significantly hinders the fairness of exams or research and development. The Constitutional Court of Korea has upheld this provision. Given the time and cost involved in developing high-quality items and the need to accurately assess examinees' abilities, there are compelling reasons to keep KMLE items confidential. As a public institution responsible for selecting qualified medical practitioners, KHPLEI should establish its disclosure policy based on a balanced assessment of public interest, without influence from specific groups. We conclude that KMLE questions qualify as non-public information under KOIDA, and KHPLEI may choose to maintain their confidentiality to ensure exam fairness and efficiency.
Purpose: This study evaluates the use of ChatGPT-4o in creating tailored continuing professional development (CPD) plans for radiography students, addressing the challenge of aligning CPD with Medical Radiation Practice Board of Australia (MRPBA) requirements. We hypothesized that ChatGPT-4o could support students in CPD planning while meeting regulatory standards.
Methods: A descriptive, experimental design was used to generate 3 unique CPD plans using ChatGPT-4o, each tailored to hypothetical graduate radiographers in varied clinical settings. Each plan followed MRPBA guidelines, focusing on computed tomography specialization by the second year. Three MRPBA-registered academics assessed the plans using criteria of appropriateness, timeliness, relevance, reflection, and completeness from October 2024 to November 2024. Ratings underwent analysis using the Friedman test and intraclass correlation coefficient (ICC) to measure consistency among evaluators.
Results: ChatGPT-4o-generated CPD plans generally adhered to regulatory standards across scenarios. The Friedman test indicated no significant differences among raters (P=0.420, 0.761, and 0.807 for each scenario), suggesting consistent scores within scenarios. However, ICC values were low (-0.96, 0.41, and 0.058 for scenarios 1, 2, and 3), revealing variability among raters, particularly in the timeliness and completeness criteria, and suggesting limitations in ChatGPT-4o's ability to address individualized and context-specific needs.
Conclusion: ChatGPT-4o demonstrates the potential to ease the cognitive demands of CPD planning, offering structured support in CPD development. However, human oversight remains essential to ensure plans are contextually relevant and deeply reflective. Future research should focus on enhancing artificial intelligence's personalization for CPD evaluation, highlighting ChatGPT-4o's potential and limitations as a tool in professional education.
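To make the ICC values above easier to interpret, the following sketch computes one common form of the coefficient, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a table of ratings. The abstract does not state which ICC form was used, so this choice is an assumption, and the ratings below are illustrative values rather than the study's data.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per rated target and one column per rater."""
    n = len(ratings)      # number of targets (e.g., criteria)
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Illustrative table: 6 targets rated by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
# Illustrative table in which raters disagree on which targets score higher.
disagreeing = [
    [5, 3, 4],
    [3, 5, 4],
    [4, 4, 4],
]

print(round(icc_2_1(example), 2))
print(icc_2_1(disagreeing))
```

The second table shows how the estimate can become negative: when the targets barely differ on average but the raters disagree about them, the between-target mean square falls below the error mean square.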
Background: In the Iranian context, no 360-degree evaluation tool has been developed to assess the performance of prehospital medical emergency students in clinical settings. This article describes the development of a 360-degree evaluation tool and presents its first psychometric evaluation.
Methods: There were 2 steps in this study: step 1 involved developing the instrument (i.e., generating the items) and step 2 constituted the psychometric evaluation of the instrument. We performed exploratory and confirmatory factor analyses and also evaluated the instrument’s face, content, and convergent validity and reliability.
Results: The instrument contains 55 items across 6 domains, including leadership, management, and teamwork (19 items), consciousness and responsiveness (14 items), clinical and interpersonal communication skills (8 items), integrity (7 items), knowledge and accountability (4 items), and loyalty and transparency (3 items). The instrument was confirmed to be a valid measure, as the 6 domains had eigenvalues over Kaiser’s criterion of 1 and in combination explained 60.1% of the variance (Bartlett’s test of sphericity [1,485]=19,867.99, P<0.01). Furthermore, this study provided evidence for the instrument’s convergent validity and internal consistency (α=0.98), suggesting its suitability for assessing student performance.
Conclusion: We found good evidence for the validity and reliability of the instrument. Our instrument can be used to make future evaluations of student performance in the clinical setting more structured, transparent, informative, and comparable.
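Two of the quantities reported above can be reproduced with short calculations. The sketch below computes Cronbach's alpha from a respondent-by-item score table and the degrees of freedom of Bartlett's test of sphericity for a given number of items. The score table is hypothetical and the function names are illustrative; only the item count of 55 comes from the abstract.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha; scores has one row per respondent, one column per item."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def bartlett_df(n_items):
    """Degrees of freedom of Bartlett's test of sphericity: p(p - 1) / 2."""
    return n_items * (n_items - 1) // 2

# Hypothetical scores: 4 respondents, 3 items (not study data).
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]

print(cronbach_alpha(scores))
print(bartlett_df(55))  # 1485, the degrees of freedom reported for 55 items
```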
Purpose: The duties of paramedics and emergency medical technicians (P&EMTs) are continuously changing due to developments in medical systems. This study presents evaluation goals for P&EMTs by analyzing their work, especially the tasks that new P&EMTs (with less than 3 years’ experience) find difficult, to foster the training of P&EMTs who could adapt to emergency situations after graduation.
Methods: A questionnaire was created based on prior job analyses of P&EMTs. The survey questions were reviewed through focus group interviews, from which 253 task elements were derived. A survey was conducted from July 10, 2023 to October 13, 2023 on the frequency, importance, and difficulty of these task elements across the 6 occupations in which P&EMTs were employed.
Results: The P&EMTs’ most common tasks involved obtaining patients’ medical histories and measuring vital signs, whereas the most important task was cardiopulmonary resuscitation (CPR). The task elements that the P&EMTs found most difficult were newborn delivery and infant CPR. New paramedics reported that treating patients with fractures, poisoning, and childhood fever was difficult, while new EMTs reported that they had difficulty keeping diaries, managing ambulances, and controlling infection.
Conclusion: Communication was the most important item for P&EMTs, whereas CPR was the most important skill. It is important for P&EMTs to have knowledge of all tasks; however, they also need to master frequently performed tasks and those that pose difficulties in the field. By deriving goals for evaluating P&EMTs, changes could be made to their education, thereby making it possible to train more capable P&EMTs.
Purpose: Faculty development (FD) is important to support teaching, including for clinical teachers. Faculty of Medicine Universitas Indonesia (FMUI) has conducted a clinical teacher training program developed by the medical education department since 2008, both for FMUI teachers and for those at other centers in Indonesia. However, participation is often challenging due to clinical, administrative, and research obligations. The coronavirus disease 2019 pandemic amplified the urge to transform this program. This study aimed to redesign and evaluate an FD program for clinical teachers that focuses on their needs and current situation.
Methods: A 5-step design thinking framework (empathizing, defining, ideating, prototyping, and testing) was used with a pre/post-test design. Design thinking made it possible to develop a participant-focused program, while the pre/post-test design enabled an assessment of the program’s effectiveness.
Results: Seven medical educationalists and 4 senior and 4 junior clinical teachers participated in a group discussion in the empathize phase of design thinking. The research team formed a prototype of a 3-day blended learning course, with an asynchronous component using the Moodle learning management system and a synchronous component using the Zoom platform. Pre/post-testing was done in 2 rounds, with 107 and 330 participants, respectively. Evaluations of the first round provided feedback for improving the prototype for the second round.
Conclusion: Design thinking enabled an innovative-creative process of redesigning FD that emphasized participants’ needs. The pre/post-testing showed that the program was effective. Combining asynchronous and synchronous learning expands access and increases flexibility. This approach could also apply to other FD programs.
Purpose: This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under 2 stopping rules (standard error of measurement [SEM]=0.30 and 0.25) using both real and simulated data in medical examinations in Korea.
Methods: This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated in R using estimated parameters from a real item bank. Outcome variables included the number of examinees passing or failing with SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates. The consistency of the real CAT results was evaluated by examining the consistency of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules.
Results: Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r=0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data.
Conclusion: The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
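The effect of an SEM-based stopping rule under the Rasch model can be sketched with a small simulation. The study generated its simulated data in R; the Python sketch below is only an illustration of the general technique. It assumes a hypothetical bank of 400 items with uniformly distributed difficulties, item selection by closest difficulty to the current ability estimate, and expected a posteriori (EAP) estimation with a standard normal prior, with the posterior standard deviation used as the SEM. None of these details are stated in the abstract.

```python
import math
import random

GRID = [-4 + 0.05 * i for i in range(161)]  # ability grid for EAP estimation

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap(responses):
    """Return (ability estimate, posterior SD) for a list of (difficulty, correct)."""
    weights = []
    for t in GRID:
        log_w = -0.5 * t * t  # standard normal prior, up to a constant
        for b, correct in responses:
            p = rasch_p(t, b)
            log_w += math.log(p if correct else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(true_theta, bank, sem_stop, seed):
    """Administer items until the SEM reaches sem_stop or the bank is exhausted."""
    rng = random.Random(seed)
    remaining = list(bank)
    responses = []
    theta, sem = 0.0, float("inf")
    while remaining and sem > sem_stop:
        # Under the Rasch model, the most informative item is the one whose
        # difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        correct = rng.random() < rasch_p(true_theta, b)
        responses.append((b, correct))
        theta, sem = eap(responses)
    return theta, sem, len(responses)

bank_rng = random.Random(0)
bank = [bank_rng.uniform(-3, 3) for _ in range(400)]  # hypothetical item bank

theta_30, sem_30, n_30 = run_cat(0.5, bank, sem_stop=0.30, seed=1)
theta_25, sem_25, n_25 = run_cat(0.5, bank, sem_stop=0.25, seed=1)
print(n_30, round(theta_30, 2), round(sem_30, 3))
print(n_25, round(theta_25, 2), round(sem_25, 3))
```

Because both runs share the same seed, the SEM 0.25 run repeats the SEM 0.30 run and then continues, which makes the extra items required by the stricter rule directly visible. Pass/fail consistency at a cut score of 0.0 can then be checked by comparing the signs of `theta_30` and `theta_25`.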
Purpose: The coronavirus disease 2019 (COVID-19) pandemic limited healthcare professional education and training opportunities in rural communities. Because the US Department of Veterans Affairs (VA) has robust programs to train clinicians in the United States, this study examined VA trainee perspectives regarding pandemic-related training in rural and urban areas and interest in future employment with the VA.
Methods: Survey responses were collected nationally from VA physician and nursing trainees before and during the COVID-19 pandemic (2018 to 2021). Logistic regression models were used to test the association between pandemic timing (pre-pandemic or pandemic), trainee program (physician or nurse), and the interaction of pandemic timing and trainee program on VA trainee satisfaction and trainee likelihood to consider future VA employment in rural and urban areas.
Results: While physician trainees at urban facilities reported decreases in overall training satisfaction and corresponding decreases in the likelihood of considering future VA employment from pre-pandemic to pandemic, rural physician trainees showed no changes in either outcome. In contrast, while nursing trainees at both urban and rural sites had decreases in training satisfaction associated with the pandemic, there was no corresponding effect on the likelihood of future employment by nurses at either urban or rural VA sites.
Conclusion: The study’s findings suggest differences in the training experiences of physicians and nurses at rural sites, as well as between physician trainees at urban and rural sites. Understanding these nuances can inform the development of targeted approaches to address the ongoing provider shortages that rural communities in the United States are facing.
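The role of the interaction term in such a logistic regression can be shown with a minimal sketch. With two binary predictors (pandemic timing and trainee program) and their interaction, the model is saturated, so the maximum likelihood coefficients equal differences of the observed log-odds in the four cells. The cell counts below are hypothetical, the function names are illustrative, and the study's actual models may have included further terms.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

def saturated_logit_coefficients(cells):
    """Coefficients of logit(p) = b0 + b1*pandemic + b2*nurse + b3*pandemic*nurse.

    cells maps (pandemic, nurse) -> (n_satisfied, n_total), with 0/1 indicators.
    """
    lo = {key: logit(s / n) for key, (s, n) in cells.items()}
    b0 = lo[(0, 0)]
    b_pandemic = lo[(1, 0)] - lo[(0, 0)]
    b_nurse = lo[(0, 1)] - lo[(0, 0)]
    b_interaction = lo[(1, 1)] - lo[(1, 0)] - lo[(0, 1)] + lo[(0, 0)]
    return b0, b_pandemic, b_nurse, b_interaction

# Hypothetical counts (not study data): (satisfied, total) per cell.
cells = {
    (0, 0): (80, 100),  # pre-pandemic, physician
    (1, 0): (60, 100),  # pandemic, physician
    (0, 1): (80, 100),  # pre-pandemic, nurse
    (1, 1): (80, 100),  # pandemic, nurse
}

b0, b_pandemic, b_nurse, b_interaction = saturated_logit_coefficients(cells)
print(b_pandemic, b_nurse, b_interaction)
```

In this hypothetical table, the pandemic coefficient is negative for physicians, and the positive interaction term offsets it for nurses; exponentiating the interaction coefficient gives the ratio of the two groups' odds ratios.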
Purpose: This study aimed to identify challenges and potential improvements in Korea’s medical education accreditation process according to the Accreditation Standards of the Korean Institute of Medical Education and Evaluation 2019 (ASK2019). Meta-evaluation was conducted to survey the experiences and perceptions of stakeholders, including self-assessment committee members, site visit committee members, administrative staff, and medical school professors.
Methods: A cross-sectional study was conducted using surveys sent to 40 medical schools. The 332 participants included self-assessment committee members, site visit committee members, administrative staff, and medical school professors. The t-test, one-way analysis of variance, and the chi-square test were used to analyze and compare opinions on medical education accreditation between the categories of participants.
Results: Site visit committee members placed greater importance on the necessity of accreditation than faculty members. A shared positive view on accreditation’s role in improving educational quality was seen among self-assessment committee members and professors. Administrative staff highly regarded the Korean Institute of Medical Education and Evaluation’s reliability and objectivity, unlike the self-assessment committee members. Site visit committee members positively perceived the clarity of accreditation standards, differing from self-assessment committee members. Administrative staff were most optimistic about implementing standards. However, the accreditation process encountered challenges, especially in duplicating content and preparing self-assessment reports. Finally, perceptions regarding the accuracy of final site visit reports varied significantly between the self-assessment committee members and the site visit committee members.
Conclusion: This study revealed diverse views on medical education accreditation, highlighting the need for improved communication, expectation alignment, and stakeholder collaboration to refine the accreditation process and quality.