Background: Generative artificial intelligence tools such as ChatGPT are increasingly used by medical students for self-directed learning. Although these models demonstrate linguistic fluency, their reliability as supplementary resources for preclinical education remains uncertain. In particular, comparisons with evidence-based references such as UpToDate are lacking.
Objective: This study evaluated the similarity between responses generated by ChatGPT (with GPT-4o mini) and those from UpToDate to preclinical medical education questions to assess ChatGPT's potential as an adjunctive learning tool.
Methods: We conducted a cross-sectional comparison study using 150 first-order questions derived from a preclinical question bank at a single allopathic institution under the oversight of a medical educator with more than 25 years of teaching experience. Each question was entered into ChatGPT 10 times in separate chat sessions, and responses from UpToDate were retrieved from the most relevant articles. The responses were preprocessed through lemmatization, stop-word removal, punctuation removal, and numeric normalization. Similarity between ChatGPT and UpToDate responses was quantified using term frequency-inverse document frequency (TF-IDF) cosine similarity. To determine whether the observed similarities exceeded chance, ChatGPT outputs were compared with a null distribution generated from randomized text.
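The preprocessing and similarity pipeline described above can be sketched with the standard library alone. This is a minimal illustration, not the study's analysis code: the stop-word list, tokenizer, and example sentences are placeholders, lemmatization is omitted because it requires an external NLP library, and a smoothed IDF (as used by common toolkits) is applied so that terms shared across a small corpus are not zeroed out.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list; the study's actual list is not specified.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "are", "for", "with"}

def preprocess(text):
    """Lowercase, normalize numbers, strip punctuation, remove stop words.
    (Lemmatization, used in the study, needs an NLP library and is omitted.)"""
    text = text.lower()
    text = re.sub(r"\d+(\.\d+)?", " numtoken ", text)  # numeric normalization
    tokens = re.findall(r"[a-z]+", text)               # drops punctuation
    return [t for t in tokens if t not in STOP_WORDS]

def tfidf_vectors(token_docs):
    """TF-IDF with smoothed IDF: log((1 + N) / (1 + df)) + 1."""
    n = len(token_docs)
    df = Counter()
    for doc in token_docs:
        df.update(set(doc))
    return [{t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
             for t, c in Counter(doc).items()}
            for doc in token_docs]

def cosine(u, v):
    """Cosine similarity between two sparse TF-IDF vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical example: a model answer, a reference passage, and unrelated text.
chatgpt_answer = "Aspirin irreversibly inhibits cyclooxygenase and reduces thromboxane production."
reference_text = "Aspirin inhibits the cyclooxygenase enzyme, lowering thromboxane synthesis."
unrelated_text = "Mitochondria generate most cellular ATP through oxidative phosphorylation."

docs = [preprocess(d) for d in (chatgpt_answer, reference_text, unrelated_text)]
v_answer, v_reference, v_unrelated = tfidf_vectors(docs)
print(round(cosine(v_answer, v_reference), 3))  # related pair: clearly above zero
print(round(cosine(v_answer, v_unrelated), 3))  # no shared terms: prints 0.0
```

Comparing observed similarities against scores computed on shuffled or random token sequences, as in the study's null-distribution check, distinguishes genuine content overlap from the baseline similarity any two medical texts share.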
Results: ChatGPT responses demonstrated statistically significant similarity to UpToDate in 59.3% (89/150) of questions. Across subject areas, pharmacology showed the highest concordance (mean cosine similarity 0.338, SD 0.134), followed by pathology (mean 0.321, SD 0.142), microbiology (mean 0.297, SD 0.108), biochemistry (mean 0.296, SD 0.120), and immunology (mean 0.275, SD 0.102). All subject-level similarity scores exceeded those generated from randomized text, confirming that the observed overlap was nonrandom.
Conclusions: ChatGPT with GPT-4o mini exhibited moderate but meaningful alignment with UpToDate across preclinical topics, performing best in fact-based disciplines such as pharmacology. Although it is not a substitute for evidence-based resources, ChatGPT may serve as an accessible adjunctive tool for medical students. Integration into preclinical learning should be coupled with artificial intelligence literacy training to promote responsible use and critical appraisal.
Background: Gender-based violence (GBV) is a public health issue affecting 1 in 3 women globally. Its impact on women's health is far-reaching, encompassing physical, mental, and social consequences. Health care professionals are uniquely positioned to identify and support GBV survivors, but adequate training is lacking.
Objective: This study aims to develop educational resources based on problem-based and experiential learning approaches using virtual reality (VR) scenarios for health sciences students to enhance their skills in addressing GBV.
Methods: A co-creation approach was adopted, encompassing 3 main strategies. First, a focus group was conducted with frontline professionals experienced in GBV. Second, co-creation workshops involved professionals from diverse fields, including higher education pedagogy, gender and public health, nursing and medical education, and immersive technology. Third, expert consultation with frontline professionals ensured coherence between the educational resources and real-world challenges. Following this phase, a first iteration of the materials was piloted with students to assess usability and relevance.
Results: The thematic analysis of the focus group content led to the identification of 9 categories illustrating the competencies and knowledge areas considered relevant to address GBV. As a result of the co-creation workshops, these categories were translated into 18 learning needs, and 4 use cases for the VR component were also identified. The VR scenarios were designed to cover critical GBV situations, fostering transversal skills such as empathic communication, ethical decision-making, and interdisciplinary collaboration. Two didactic methodologies were proposed for each scenario: a problem-based learning sequence and a single-session experiential learning approach, culminating in 4 VR videos and their methodological guides.
Conclusions: The grounding of these educational resources in real-world scenarios, in conjunction with the competencies identified by frontline health and social care professionals with expertise in GBV, ensured alignment with the challenges professionals face in their practice. This helped bridge the gap between theory and practice, offering an innovative approach to GBV education for students of health sciences.
Unlabelled: This viewpoint reflects on how Generation Z (born between 1995 and 2009), shaped by constant digital engagement, a growing awareness of mental health, and a dopamine-driven environment, is transforming medical education and practice. We explore, from a reflective and interdisciplinary perspective, how the defining characteristics of Generation Z, such as their familiarity with technology, demand for emotional safety, and resistance to traditional hierarchies, might reshape the ways we teach, learn, and practice medicine. Drawing on neuroscience, psychology, sociology, and the medical education literature, this viewpoint emphasizes the need to move beyond knowledge transmission and foster self-regulation, critical thinking, and ethical judgment. We call for a deliberate and compassionate adaptation of medical education to cultivate the skills required for a profession increasingly practiced in a context of overstimulation and complexity.
Background: The exponential growth of medical knowledge presents a paradox for modern medical education. While access to information is immediate, applying it in a clinically meaningful way remains a challenge. Large language models (LLMs), such as ChatGPT, are widely used for information retrieval, yet their role in dynamic, high-pressure clinical learning remains poorly understood.
Objective: This study aims to evaluate whether unstructured access to an LLM improves decision-making, teamwork, and confidence in trauma education for medical students.
Methods: This randomized controlled pilot study involved 41 final-year medical students participating in a trauma simulation session. Students self-selected into teams of 4 to 6 and were randomized to either an LLM-assisted group (ChatGPT-4o mini) or a control group without LLM access. All teams completed 18 video-based trauma scenarios requiring time-sensitive clinical decisions. Prompting was unrestricted. Confidence and trauma exposure were assessed using pre- and postquestionnaires. Facilitators rated teamwork (on a scale of 1-5), decision accuracy, and response times. Knowledge retention was measured 4 weeks later via an online quiz.
Results: Confidence in trauma management improved in both groups (P<.001), with larger gains in the non-LLM group (P=.02). LLM support did not enhance decision accuracy or speed and was associated with longer response times in some complex cases. Teams without LLM access showed more active discussion and tended toward higher teamwork ratings (median 5.0, IQR 5.0-5.0, vs median 3.5, IQR 3.0-4.5; P=.08), although this difference did not reach statistical significance. Students primarily used the LLM for fact-checking but reported that its responses were vague or overly general. Knowledge retention was high across both groups and did not differ significantly (P=.33).
Conclusions: While students appreciated the inclusion of artificial intelligence (AI), unstructured LLM use did not improve performance and may have disrupted group reasoning. The use of non-English prompting likely contributed to lower AI performance, underscoring the importance of language alignment in LLM applications. This pilot study highlights the need for structured AI integration and targeted instruction in AI literacy. Simulation-based trauma education proved effective and well received, but optimizing the educational value of LLMs will require thoughtful curricular design. Larger studies are needed to define best practices for LLM use in clinical education.
Background: Simulation-based education is crucial for training health care professionals in advanced cardiac life support. However, access to high-fidelity in-person simulation is frequently limited by geographic, logistical, and financial constraints. Augmented reality (AR) offers the potential to deliver remote, immersive training experiences that may overcome these barriers, but its effectiveness compared with traditional simulation remains uncertain.
Objective: This study aimed to determine whether remote AR simulation is noninferior to traditional in-person simulation for assessing team leader performance during a ventricular fibrillation cardiac arrest scenario.
Methods: This noninferiority randomized trial enrolled participants at the State University of Campinas (UNICAMP), Brazil, and used cross-continental remote instruction from Stanford University (in the United States) for the AR arm. A total of 50 health care professionals were randomized to either remote AR simulation with a geographically distant instructor (n=25) or traditional in-person simulation (n=25). All participants completed an identical ventricular fibrillation cardiac arrest case as team leaders. Leader performance was assessed using an adapted, validated checklist-based instrument for cognitive leadership and an observational behavioral measure (Behaviorally Anchored Rating Scale). Secondary outcomes included AR participants' evaluations of usability and ergonomics.
Results: A total of 42 participants fully completed the study procedures (remote AR group: n=22; traditional in-person group: n=20). The AR group demonstrated noninferior performance compared to the traditional group across all outcomes. The mean checklist scores were 41.6 (SD 6.2) and 42.6 (SD 5.8) in the remote AR group and traditional in-person group, respectively. The lower bound of the AR group's 95% CI (38.9-44.4) lay above the 20% noninferiority threshold of 34.1. Most participants rated usability and ergonomics favorably.
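The noninferiority call above follows from simple arithmetic. Assuming the threshold was derived as 80% of the traditional group's mean checklist score (i.e., a 20% margin), the reported numbers can be checked as follows; this is a sketch of the published figures, not the study's analysis code:

```python
traditional_mean = 42.6       # in-person group's mean checklist score
noninferiority_margin = 0.20  # 20% margin reported in the abstract
ar_ci_lower_bound = 38.9      # lower bound of the AR group's 95% CI

# Threshold = comparator mean minus the 20% margin.
threshold = traditional_mean * (1 - noninferiority_margin)
print(round(threshold, 1))            # 34.1, matching the reported threshold
print(ar_ci_lower_bound > threshold)  # True: the CI excludes inferiority
```

Because the entire AR confidence interval sits above the threshold, the data are inconsistent with the AR group performing more than 20% worse than the in-person group.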
Conclusions: Participants in the remote AR simulation demonstrated noninferior team leader decision-making and behavioral performance compared with those in traditional in-person simulation. These findings suggest that remote AR may be a viable strategy to expand access to scenario-based assessment of cardiac arrest leadership, particularly in resource-limited settings. AR participants also reported high usability and low ergonomic burden, indicating comfortable headset use.

