Design of a Real-Time GAN based Speech Recognizer for Consumer Electronics
Pubali Roy, Pranav M Bidare, P. Bharadwaj, M. J
2023 International Conference on Inventive Computation Technologies (ICICT), 2023-04-26
DOI: https://doi.org/10.1109/ICICT57646.2023.10134295
Citation count: 0
Abstract
Modern consumer electronics with speech-controlled features and hands-free operation, including automotive electronics, televisions, microwave ovens, music systems, and refrigerators, have spearheaded research into smart electronic devices for consumers. A real-time speech recognizer is the main module in these systems, and much current research addresses its design, with shorter recognition time regarded as one of the main challenges. Generative Adversarial Networks (GANs) are mainly used with two-dimensional signals such as images, for applications such as recognition, synthesis, and translation. This paper attempts to design and evaluate a real-time GAN-based pattern recognizer for one-dimensional speech signals. To achieve this, the one-dimensional speech signal is first converted into a two-dimensional spectrogram, which is then fed to the GAN model for recognition. The proposed speech recognizer yielded a maximum recognition accuracy of 100% with a recognition time of 49.10 ms per word. The proposed work can be readily employed in the design of various smart consumer electronics.
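The conversion of a one-dimensional speech signal into a two-dimensional spectrogram can be illustrated with a short-time Fourier transform. The sketch below is only a minimal illustration, not the authors' pipeline: the abstract does not state the frame length, hop size, window, or scaling, so the 256-sample Hann window, 128-sample hop, decibel scaling, and the function name `spectrogram` are all assumptions, and a synthetic 1 kHz tone stands in for a spoken word.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Zero-pad very short signals so at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real FFT per frame gives the magnitude spectrum over time.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small offset avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10).T

# Synthetic stand-in for a spoken word: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
print(spec.shape)  # (129, 124): 129 frequency bins by 124 time frames
```

The resulting array can be treated like a single-channel image, which is what lets a model built for two-dimensional inputs operate on speech. With these assumed settings each frequency bin spans 62.5 Hz, so the tone's energy concentrates in bin 16.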