Children's picture books enrich the language that children are exposed to in their early years. It is important to better characterise this input, and recent research has begun to explore corpora of narrative picture books. However, previous research has been restricted by methodological limitations that make it difficult to develop large datasets. Further, information texts become increasingly important as children progress through school, but little is known about the language of their earliest form, namely informational picture books. The current study investigates how exposure to informational and narrative picture books might change the language environment of children in a way that supports reading development.
The study applies data science methods to build a larger language model than previously possible and investigates the lexical profile of over 2000 narrative and informational picture books. Picture book vocabulary is derived from digital sources of books read aloud online, a novel approach that gives researchers access to larger pools of data than earlier methods allowed. Detailed comparisons of informational and narrative picture books are reported for lexical diversity, density, morphology, academic vocabulary and semantic clusters. Models are developed to estimate the additional word types a child may encounter in their language environment through narrative and informational picture books.
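To make two of the lexical measures concrete, the following is a minimal Python sketch of lexical diversity (here a simple type-token ratio) and lexical density (the share of content words among all tokens). It is an illustration only: the study's exact measures are not specified here, the names `tokenize`, `type_token_ratio` and `lexical_density` are hypothetical, and the function-word list is a deliberately tiny stand-in for a part-of-speech tagger or full stop-word list.

```python
import re

# Assumption: a tiny illustrative function-word list. A real analysis would
# use a part-of-speech tagger or a complete stop-word list instead.
FUNCTION_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "it", "was"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Lexical diversity as unique word types divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def lexical_density(tokens):
    """Share of tokens that are not in the function-word list."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


sample = "The frog sat on the log and the frog sang"
tokens = tokenize(sample)
print(type_token_ratio(tokens))  # 7 types over 10 tokens
print(lexical_density(tokens))   # 5 content words over 10 tokens
```

A raw type-token ratio falls as texts get longer, so comparisons across books of different lengths would normally need a length-corrected measure or equal-sized samples.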
The study demonstrates that informational and narrative picture books expose children to substantially different semantic environments. Informational picture books provide extensive exposure to academic vocabulary, an input aligned with later reading needs. Further, computational models indicate that reading a book daily or every second day over a year might boost unique-word exposure by approximately 10% for some language environments.
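The idea of a boost in unique-word exposure can be sketched as the proportional growth in word types once book vocabulary is added to a baseline language environment. The Python below is a toy illustration under that assumption, not the study's computational model; the function `added_type_exposure` and the example word lists are hypothetical.

```python
def added_type_exposure(baseline_types, books):
    """Return the proportional growth in unique word types after adding books.

    baseline_types: iterable of word types already present in the child's
                    language environment (hypothetical input).
    books:          iterable of token lists, one list per book read.
    """
    seen = set(baseline_types)
    start = len(seen)
    for book_tokens in books:
        # Only words not already in the environment increase the type count.
        seen.update(book_tokens)
    return (len(seen) - start) / start if start else 0.0


# Toy data: 10 baseline types and two very short "books".
baseline = {"the", "dog", "ran", "to", "a", "ball", "and", "sat", "on", "mat"}
books = [
    ["the", "volcano", "erupted"],
    ["a", "dog", "sat", "on", "lava"],
]
boost = added_type_exposure(baseline, books)
print(f"{boost:.0%}")  # three new types on a baseline of ten
```

In this framing, the estimated boost depends on how large and varied the baseline environment already is, which is consistent with the result being reported for some language environments rather than all.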
Combining informational and narrative picture books enhances the language environment of children more than narratives alone, providing greater lexical diversity and density and more complex morphology.