Word segmentation from transcriptions of child-directed speech using lexical and sub-lexical cues

Authors: Zébulon Goriely, Andrew Caines, Paula Buttery
Journal: Journal of Child Language, pp. 1-41
DOI: 10.1017/S0305000923000491 (https://doi.org/10.1017/S0305000923000491)
Epub date: 2023/9/12; publication date: 2025-01-01
Publication type: Journal Article
We compare two frameworks for the segmentation of words in child-directed speech, PHOCUS and MULTICUE. PHOCUS is driven by lexical recognition, whereas MULTICUE combines sub-lexical properties to make boundary decisions; the two represent differing views of speech processing. We replicate these frameworks, perform novel benchmarking and confirm that both achieve competitive results. We develop a new framework for segmentation, the DYnamic Programming MULTIple-cue framework (DYMULTI), which combines the strengths of PHOCUS and MULTICUE by considering both sub-lexical and lexical cues when making boundary decisions. DYMULTI achieves state-of-the-art results and outperforms PHOCUS and MULTICUE on 15 of 26 languages in a cross-lingual experiment. Because DYMULTI is built on psycholinguistic principles, these results validate it as a robust model for speech segmentation and a contribution to the understanding of language acquisition.
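To make the idea of combining lexical and sub-lexical cues in a dynamic program more concrete, the following is a minimal, generic sketch. It is not the DYMULTI, PHOCUS or MULTICUE implementation. The function `segment`, the toy `lexicon`, the `boundary_cue` function, the `unknown_penalty` and `max_len` parameters, and the use of letters as stand-ins for phonemes are all assumptions made for illustration only.

```python
import math

def segment(utterance, lexicon_logprob, boundary_score,
            unknown_penalty=-10.0, max_len=8):
    """Viterbi-style search for the highest-scoring segmentation.

    Each candidate word contributes a lexical score (its log-probability
    in a lexicon, or a per-symbol penalty if unknown) plus a sub-lexical
    score for the boundary placed at its right edge.
    """
    n = len(utterance)
    best = [-math.inf] * (n + 1)   # best[i]: best score for utterance[:i]
    back = [0] * (n + 1)           # back[i]: start of the last word ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = utterance[start:end]
            lexical = lexicon_logprob.get(word, unknown_penalty * len(word))
            # The utterance end is a given boundary, so it gets no cue score.
            sublexical = boundary_score(utterance, end) if end < n else 0.0
            score = best[start] + lexical + sublexical
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(utterance[start:end])
        end = start
    return words[::-1]

# Toy lexicon (hypothetical values): log-probabilities of known words.
lexicon = {"the": math.log(0.5), "dog": math.log(0.3)}

# Toy sub-lexical cue (hypothetical): reward a boundary between "e" and "d".
cue_table = {"ed": 1.0}

def boundary_cue(utterance, pos):
    # Look at the two symbols on either side of the proposed boundary.
    return cue_table.get(utterance[pos - 1:pos + 1], 0.0)

result = segment("thedog", lexicon, boundary_cue)
print(result)
```

In this sketch the lexical and sub-lexical scores are simply added, so either source of evidence can tip a boundary decision; how the actual frameworks weight and learn their cues is not specified in the abstract.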
About the journal:
A key publication in the field, Journal of Child Language publishes articles on all aspects of the scientific study of language behaviour in children, the principles which underlie it, and the theories which may account for it. The international range of authors and breadth of coverage allow the journal to forge links between many different areas of research including psychology, linguistics, cognitive science and anthropology. This interdisciplinary approach spans a wide range of interests: phonology, phonetics, morphology, syntax, vocabulary, semantics, pragmatics, sociolinguistics, or any other recognised facet of language study.