Scaffolding features that provide multimodal support for the pronunciation and meaning of words are increasingly common in digital reading environments. These vocabulary scaffolds are intended to aid the accurate pronunciation and understanding of individual words in context, thus supporting both vocabulary development and text comprehension. However, the evidence on their efficacy remains inconclusive. The present study adds to the evidence base by examining (1) whether child characteristics predict the use of vocabulary scaffolds; (2) whether the use of vocabulary scaffolds is associated with reading comprehension performance; and (3) whether the association between the use of scaffolds and reading comprehension is modulated by child and/or item characteristics. A large cohort (N ≈ 120,000) of 5- to 8-year-old children in the United States interacted with a gamified digital reading environment with embedded vocabulary scaffolds, thereby generating a large observational dataset of user log files. Confirmatory analyses with Generalized Linear Mixed Models (GLMMs) indicated that children with lower literacy skills, beginning readers, girls, and bilingual students were more likely to use the scaffolds. Overall, the use of scaffolds was associated with better reading comprehension performance, and this association was modulated by both child and item characteristics. We conclude that vocabulary scaffolds may be promising tools to facilitate reading comprehension and reduce performance differences among diverse learners in digital reading environments. Educational implications and recommendations for future research are discussed.