Retrieval practice is well established as a powerful tool for reinforcing long-term learning. Most previous research has concentrated on the effectiveness of overt retrieval, which involves recalling information from memory and generating overt responses by writing, typing, or speaking aloud the retrieved information. Here we ask whether covert retrieval, which involves mentally retrieving information without producing overt responses, can enhance learning and consolidate long-term memory, and whether it does so as effectively as overt retrieval. The current meta-analysis integrated data from 2560 participants across 18 studies to investigate the magnitude, boundary conditions, and underlying mechanisms of the covert retrieval effect and the relative efficacy of overt and covert retrieval. The results showed that covert retrieval enhances learning to a small but significant extent (g = 0.23), and that its effectiveness is moderated by several factors, including provision of corrective feedback, control strategy, and retention interval. The results suggest that the additional exposure and desirable difficulty theories jointly account for the covert retrieval effect. The meta-analysis also found that overt retrieval is more effective than covert retrieval (g = 0.17), with the size of this additional benefit moderated by the mode in which covert retrieval is performed. The results support the truncated search and desirable difficulty explanations of the relative benefit of overt compared to covert retrieval. Overall, the documented findings provide practical implications for optimizing learning and teaching practices and highlight several important directions for future research.
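For readers unfamiliar with the effect size metric reported above, the following is a minimal sketch of how a standardized mean difference of the Hedges' g type is commonly computed for a single two-group comparison. The function name, argument names, and the numbers in the example call are hypothetical and chosen only for illustration; they are not data from the meta-analysis, and the abstract does not state which estimator or software the authors used.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with the usual small-sample correction.

    All arguments are hypothetical summary statistics for one study:
    group means, group standard deviations, and group sizes for a
    treatment (e.g., covert retrieval) and a control condition.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    # Cohen's d: raw mean difference in pooled-SD units.
    d = (mean_t - mean_c) / pooled_sd
    # Common approximation of the small-sample correction factor.
    correction = 1 - 3 / (4 * df - 1)
    return d * correction


# Illustrative numbers only (not taken from any study in the meta-analysis).
g_example = hedges_g(mean_t=0.60, mean_c=0.55, sd_t=0.20, sd_c=0.20, n_t=30, n_c=30)
print(round(g_example, 3))
```

A meta-analysis would additionally weight and pool many such study-level estimates; that step is omitted here.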
Human movement plays a foundational role in cognition and learning. This topical collection brings together theoretical and empirical work examining how gestures, physical activity, and virtual movement enhance language learning, multimedia learning, and activity-based learning. Regarding language learning, interacting with virtual objects improves vocabulary learning, especially for learners with low language aptitude. Additionally, emotional narratives support memory more effectively than neutral ones, while instructed gesturing may hinder recall for some learners. In multimedia learning, pointing improves attention and comprehension, whereas tracing can impede learning due to cognitive overload. For activity-based learning, theoretical contributions offer frameworks for integrating movement into learning tasks, emphasizing mechanisms such as generative learning, social cognition, and offloaded processing in areas ranging from digital education to stimming behaviors. Together, these studies offer insights for designing effective, movement-based instruction across diverse learning environments and populations, underscoring the dynamic relationship between bodily action and cognitive development in education.
The generative learning strategy of learning by drawing has received increased attention in recent years. Although this strategy is regularly used by educators, the literature suggests that its effectiveness depends on several factors. In this review, we highlight recent research trends and methodological progress within the field. Although recent developments have led to clearer and more comparable results, many current studies still report contradictory results regarding the efficacy and boundary conditions of the learning strategy. In terms of study designs and the targeted types of knowledge, a trend towards digital drawing studies and a growing variety in the content domains of the investigated learning tasks can be observed. Because results from different studies are often difficult to compare, we argue for differentiating visualizations more clearly and standardizing the terminology for visualizations used in learning-oriented drawing tasks. As the properties of visualizations can place varying cognitive demands on learners, differences in the level of skill required to produce different types of drawings, among other factors and learner variables, can affect the outcomes of this teaching method. Based on our review, we discuss practical and ethical implications as well as considerations for future research.
A growing body of evidence shows that positive student outcomes are associated with racial/ethnic diversity among university STEM instructors. However, few studies to date have been able to provide direct causal evidence identifying the specific mechanism(s) hypothesized to drive the benefits of instructor racial/ethnic diversity. Leaving these mechanisms unexplained may lead both receptive and critical readers to infer that race or ethnicity are somehow “natural” categories that “cause” such outcomes. In this narrative review, we eschew such racial essentialism in favor of an understanding of race as socially constructed, and use an ecological systems perspective to examine how multiple mechanisms of systemic racism operate inside and outside classrooms across multiple levels of analysis. Understanding how these mechanisms relate to each other, and how multiple interconnected mechanisms may drive the benefits of instructor racial/ethnic diversity, could inform the design of policies and practices to disrupt racism and advance equity. By integrating several bodies of psychological and sociological research on systemic racism in STEM and in higher education more broadly, we outline a multi-path model to explain how and under what circumstances STEM instructor racial/ethnic diversity may have particular effects on student experiences or outcomes. We use this model to generate predictions and recommend how researchers could test these predictions in future studies.
In their commentary on our meta-analysis, Zitzmann and Orona (2025) used formal proof and cited methodological studies to argue that test reliability is important, that Cronbach’s alpha generally indicates test reliability, and that cutoff values for alpha are indispensable. We agree that high reliability is important for all tests. Yet alpha does not reflect the reliability of knowledge tests. Zitzmann and Orona’s arguments are based on the unwarranted assumption that knowledge is always homogeneous. Using a concrete example, we show how item interrelatedness (i.e., alpha) can be low for heterogeneous constructs such as knowledge, even when measurement error is minimal (i.e., reliability is high). After a brief discussion of how researchers can heuristically assess construct heterogeneity, we explore alternatives to alpha for evaluating the reliability of knowledge tests. We conclude that abandoning alpha as a reliability index does not compromise the quality of measurement. On the contrary, it is a step toward sounder methodological standards in the measurement of knowledge.
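The distinction between item interrelatedness and reliability can be illustrated with a small simulation. The sketch below is not the authors' concrete example; it is a hypothetical setup in which each item taps an independent piece of knowledge and responses contain no measurement error at all, so a second administration reproduces the first exactly. The sample size, number of items, random seed, and all function and variable names are assumptions made for illustration.

```python
import random
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per person)."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_people, n_items = 2000, 10

# Hypothetical heterogeneous knowledge: each person either knows (1) or does
# not know (0) each fact, independently of every other fact.
knowledge = [[rng.randint(0, 1) for _ in range(n_items)] for _ in range(n_people)]

# No measurement error: observed responses equal true knowledge on both
# administrations of the test.
first = knowledge
second = [row[:] for row in knowledge]

alpha = cronbach_alpha(first)
retest = pearson([sum(r) for r in first], [sum(r) for r in second])

print(f"alpha = {alpha:.3f}, test-retest r = {retest:.3f}")
```

Under these assumptions alpha should come out close to zero because the items are uncorrelated, while the test-retest correlation of the sum scores is essentially perfect because nothing in the responses is random error. Real knowledge tests will fall between this extreme and a fully homogeneous construct.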

