Research on raw speech isolated word recognition based on Sincnet-CNN model
Authors: Gao Hu, Qingwei Zeng, Chao Long, Dianyou Geng
Published in: 2022 3rd International Conference on Information Science, Parallel and Distributed Systems (ISPDS), 2022-07-22
DOI: 10.1109/ISPDS56360.2022.9874177
To speed up model training, reduce the number of trainable parameters, and improve the accuracy of isolated word recognition on raw speech, an interpretable convolutional filter structure (SincNet) combined with a convolutional neural network (CNN) is proposed. The model becomes lightweight and its computational complexity is reduced while the speech recognition rate is maintained. Experimental results show that, compared with the traditional neural network model, the proposed model effectively improves the performance of raw speech isolated word recognition.
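The abstract does not spell out the filter definition, so the following is only a minimal sketch of how a sinc-based first layer is commonly formulated, not the authors' implementation. It assumes the usual SincNet construction, in which each filter is a band-pass kernel built as the difference of two windowed low-pass sinc functions and is therefore described by just two cutoff frequencies. The function names, the kernel length of 251, the 16 kHz sample rate, and the use of plain NumPy instead of a trainable framework are all illustrative assumptions.

```python
import numpy as np


def sinc_bandpass(f1_hz, f2_hz, kernel_size=251, sample_rate=16000):
    """Band-pass FIR kernel defined only by its two cutoffs (hypothetical helper)."""
    # Symmetric time axis in samples, centred on zero.
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # Normalised cutoff frequencies in cycles per sample.
    f1 = f1_hz / sample_rate
    f2 = f2_hz / sample_rate
    # np.sinc(x) is sin(pi*x)/(pi*x), so 2*f*np.sinc(2*f*n) is an ideal low-pass;
    # the difference of two low-pass responses gives a band-pass response.
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # A Hamming window smooths the truncation of the ideal (infinite) filter.
    return kernel * np.hamming(kernel_size)


def sinc_layer(waveform, cutoffs_hz, kernel_size=251, sample_rate=16000):
    """Filter a raw waveform with one sinc band-pass kernel per (low, high) pair."""
    return np.stack([
        np.convolve(waveform,
                    sinc_bandpass(lo, hi, kernel_size, sample_rate),
                    mode="same")
        for lo, hi in cutoffs_hz
    ])


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 1000 * t)           # 1 kHz test tone
    out = sinc_layer(tone, [(500, 1500), (3000, 4000)])
    print(out.shape, (out ** 2).sum(axis=1))      # per-band output energy
```

Under these assumptions, each filter contributes two learnable values (its low and high cutoff) rather than one weight per tap, so a layer of F filters of length L needs 2F parameters instead of F·L. This is one plausible reading of how the parameter reduction and interpretability mentioned in the abstract arise; the paper's actual layer sizes are not given here.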