Ming Sun, A. Schwarz, Minhua Wu, N. Strom, S. Matsoukas, S. Vitaladevuni
{"title":"跨语言迁移学习技术在小足迹关键词识别中的实证研究","authors":"Ming Sun, A. Schwarz, Minhua Wu, N. Strom, S. Matsoukas, S. Vitaladevuni","doi":"10.1109/ICMLA.2017.0-150","DOIUrl":null,"url":null,"abstract":"This paper presents our work on building a small-footprint keyword spotting system for a resource-limited language, which requires low CPU, memory and latency. Our keyword spotting system consists of deep neural network (DNN) and hidden Markov model (HMM), which is a hybrid DNN-HMM decoder. We investigate different transfer learning techniques to leverage knowledge and data from a resource-abundant source language to improve the keyword DNN training for a target language which has limited in-domain data. The approaches employed in this paper include training a DNN using source language data to initialize the target language DNN training, mixing data from source and target languages together in a multi-task DNN training setup, using logits computed from a DNN trained on the source language data to regularize the keyword DNN training in the target language, as well as combinations of these techniques. Given different amounts of target language training data, our experimental results show that these transfer learning techniques successfully improve keyword spotting performance for the target language, measured by the area under the curve (AUC) of DNN-HMM decoding detection error tradeoff (DET) curves using a large in-house far-field test set.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"22 1","pages":"255-260"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"An Empirical Study of Cross-Lingual Transfer Learning Techniques for Small-Footprint Keyword Spotting\",\"authors\":\"Ming Sun, A. Schwarz, Minhua Wu, N. Strom, S. Matsoukas, S. Vitaladevuni\",\"doi\":\"10.1109/ICMLA.2017.0-150\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents our work on building a small-footprint keyword spotting system for a resource-limited language, which requires low CPU, memory and latency. Our keyword spotting system consists of deep neural network (DNN) and hidden Markov model (HMM), which is a hybrid DNN-HMM decoder. We investigate different transfer learning techniques to leverage knowledge and data from a resource-abundant source language to improve the keyword DNN training for a target language which has limited in-domain data. The approaches employed in this paper include training a DNN using source language data to initialize the target language DNN training, mixing data from source and target languages together in a multi-task DNN training setup, using logits computed from a DNN trained on the source language data to regularize the keyword DNN training in the target language, as well as combinations of these techniques. Given different amounts of target language training data, our experimental results show that these transfer learning techniques successfully improve keyword spotting performance for the target language, measured by the area under the curve (AUC) of DNN-HMM decoding detection error tradeoff (DET) curves using a large in-house far-field test set.\",\"PeriodicalId\":6636,\"journal\":{\"name\":\"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"volume\":\"22 1\",\"pages\":\"255-260\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLA.2017.0-150\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2017.0-150","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An Empirical Study of Cross-Lingual Transfer Learning Techniques for Small-Footprint Keyword Spotting
This paper presents our work on building a small-footprint keyword spotting system for a resource-limited language, which requires low CPU, memory and latency. Our keyword spotting system consists of deep neural network (DNN) and hidden Markov model (HMM), which is a hybrid DNN-HMM decoder. We investigate different transfer learning techniques to leverage knowledge and data from a resource-abundant source language to improve the keyword DNN training for a target language which has limited in-domain data. The approaches employed in this paper include training a DNN using source language data to initialize the target language DNN training, mixing data from source and target languages together in a multi-task DNN training setup, using logits computed from a DNN trained on the source language data to regularize the keyword DNN training in the target language, as well as combinations of these techniques. Given different amounts of target language training data, our experimental results show that these transfer learning techniques successfully improve keyword spotting performance for the target language, measured by the area under the curve (AUC) of DNN-HMM decoding detection error tradeoff (DET) curves using a large in-house far-field test set.