Learning sound categories is central to language acquisition, yet little is known about the extent of phonetic variability in the learner’s input. In this study, we phonetically annotated coronal segments (/t/, /d/, /s/, /z/, and /n/) in a corpus of naturalistic American English infant-directed speech (IDS). We did not find evidence that IDS is consistently more canonical than adult-directed speech (ADS), challenging the notion of IDS as a learning register. Although IDS is not more canonical than ADS overall, the canonical form was nonetheless the most frequent form in IDS for all segments except /t/. We also considered how infants may move beyond identifying the canonical form and learn to cluster allophones; for this purpose, we quantified the dissimilarity in the phonological environments of the variants in question. Lastly, we investigated a case in which the overwhelming majority of instantiations were not canonical, word-final /t/ and /d/, and demonstrated that morphologically conditioned suffixes were more canonical than other word-final segments. This corpus is a vital step towards understanding how infants can learn to categorize sounds from their input and will be an invaluable tool for future sociolinguistic, computational, and theoretical modeling of language learning.