{"title":"词汇前缀树与WFST: LVCSR中两种动态搜索概念的比较","authors":"David Rybach, H. Ney, R. Schlüter","doi":"10.1109/TASL.2013.2248723","DOIUrl":null,"url":null,"abstract":"Dynamic network decoders have the advantage of significantly lower memory consumption compared to static network decoders, especially when huge vocabularies and complex language models are required. This paper compares the properties of two well-known search strategies for dynamic network decoding, namely history conditioned lexical tree search and weighted finite-state transducer-based search using on-the-fly transducer composition. The two search strategies share many common principles like the use of dynamic programming, beam search, and many more. We point out the similarities of both approaches and investigate the implications of their differing features, both formally and experimentally, with a focus on implementation independent properties. Therefore, experimental results are obtained with a single decoder by representing the history conditioned lexical tree search strategy in the transducer framework. The properties analyzed cover structure and size of the search space, differences in hypotheses recombination, language model look-ahead techniques, and lattice generation.","PeriodicalId":55014,"journal":{"name":"IEEE Transactions on Audio Speech and Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TASL.2013.2248723","citationCount":"11","resultStr":"{\"title\":\"Lexical Prefix Tree and WFST: A Comparison of Two Dynamic Search Concepts for LVCSR\",\"authors\":\"David Rybach, H. Ney, R. Schlüter\",\"doi\":\"10.1109/TASL.2013.2248723\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Dynamic network decoders have the advantage of significantly lower memory consumption compared to static network decoders, especially when huge vocabularies and complex language models are required. This paper compares the properties of two well-known search strategies for dynamic network decoding, namely history conditioned lexical tree search and weighted finite-state transducer-based search using on-the-fly transducer composition. The two search strategies share many common principles like the use of dynamic programming, beam search, and many more. We point out the similarities of both approaches and investigate the implications of their differing features, both formally and experimentally, with a focus on implementation independent properties. Therefore, experimental results are obtained with a single decoder by representing the history conditioned lexical tree search strategy in the transducer framework. The properties analyzed cover structure and size of the search space, differences in hypotheses recombination, language model look-ahead techniques, and lattice generation.\",\"PeriodicalId\":55014,\"journal\":{\"name\":\"IEEE Transactions on Audio Speech and Language Processing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/TASL.2013.2248723\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Audio Speech and Language Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TASL.2013.2248723\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Audio Speech and Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TASL.2013.2248723","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Lexical Prefix Tree and WFST: A Comparison of Two Dynamic Search Concepts for LVCSR
Dynamic network decoders have the advantage of significantly lower memory consumption compared to static network decoders, especially when huge vocabularies and complex language models are required. This paper compares the properties of two well-known search strategies for dynamic network decoding, namely history conditioned lexical tree search and weighted finite-state transducer-based search using on-the-fly transducer composition. The two search strategies share many common principles like the use of dynamic programming, beam search, and many more. We point out the similarities of both approaches and investigate the implications of their differing features, both formally and experimentally, with a focus on implementation independent properties. Therefore, experimental results are obtained with a single decoder by representing the history conditioned lexical tree search strategy in the transducer framework. The properties analyzed cover structure and size of the search space, differences in hypotheses recombination, language model look-ahead techniques, and lattice generation.
期刊介绍:
The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.