Spoken Keyword Detection Based on Wasserstein Generative Adversarial Network
Wen Zhao, She Kun, Chen Hao
2020 5th International Conference on Mechanical, Control and Computer Engineering (ICMCCE), pp. 1283-1288, published 2020-12-01
DOI: 10.1109/ICMCCE51767.2020.00281 (https://doi.org/10.1109/ICMCCE51767.2020.00281)
Artificial neural networks have developed rapidly and are now applied across many areas of computing. This paper combines deep neural networks with keyword detection and proposes a spoken keyword detection method based on the Wasserstein Generative Adversarial Network (WGAN), which differs substantially from existing methods. The method relies on WGAN's ability to generate data autonomously: it generates new sequences and analyzes them to determine whether keywords are present and where they appear. The generator in the WGAN fits the observed data to generate new data, and the discriminator classifies the generated data and the labels. The generator and discriminator are trained adversarially. The proposed method is simple, requires no complex acoustic model, and does not require the speech to be transcribed into text, so it is also applicable to languages without a written form. Experiments were conducted on the TIMIT corpus and a self-recorded Chinese corpus. Compared with a Convolutional Neural Network (CNN) and a Deep Convolutional Generative Adversarial Network (DCGAN), the method shows significant improvement.
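The abstract does not give the training objective, so the following is only a minimal sketch of the adversarial training signal, assuming the standard WGAN formulation with weight clipping rather than the authors' exact losses. The function names `critic_loss`, `generator_loss`, and `clip_weights` are hypothetical, and the scores are plain lists of floats standing in for discriminator (critic) outputs.

```python
# Sketch of the standard WGAN objective (an assumption; the paper's exact
# losses are not stated in the abstract). All names here are hypothetical.

def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)

def critic_loss(real_scores, fake_scores):
    """Loss minimized by the critic (discriminator).

    The critic tries to score real data high and generated data low,
    so it minimizes mean(fake) - mean(real).
    """
    return mean(fake_scores) - mean(real_scores)

def generator_loss(fake_scores):
    """Loss minimized by the generator.

    The generator tries to raise the critic's score on generated data,
    so it minimizes -mean(fake).
    """
    return -mean(fake_scores)

def clip_weights(weights, c=0.01):
    """Clamp each critic weight to [-c, c], as in the original WGAN
    weight-clipping scheme for limiting the critic's Lipschitz constant."""
    return [max(-c, min(c, w)) for w in weights]

if __name__ == "__main__":
    real = [1.0, 3.0]   # critic scores on real samples (made-up values)
    fake = [0.0, 1.0]   # critic scores on generated samples (made-up values)
    print(critic_loss(real, fake))
    print(generator_loss(fake))
    print(clip_weights([0.5, -0.02, 0.005]))
```

In a full training loop the critic would typically be updated several times per generator update, with `clip_weights` applied after each critic step. How the generated sequences are then mapped to keyword presence and position is specific to the paper and is not sketched here.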