DNN based Robust Speech Feature Extraction and Signal Noise Removal Method Using Improved Average Prediction LMS Filter for Speech Recognition
Author: Sangshin Oh
Journal of Convergence Information Technology, vol. 42(1), pp. 1-6, 2021-01-01
DOI: https://doi.org/10.22156/CS4SMB.2021.11.06.001
The application of deep neural networks (DNNs) has increased the use of speech recognition, but parallel training of a DNN requires more computation than a conventional GMM, and overfitting occurs when the amount of data is small. To solve this problem, we propose an efficient method for robust speech feature extraction and speech signal noise removal that works even when the amount of data is small. Speech feature extraction efficiently extracts speech energy by applying the difference in frame energy for speech together with the zero-crossing ratio and level-crossing ratio, both of which are affected by the speech signal. Noise is removed with an improved average-prediction LMS filter, which loses little speech information and maintains the intrinsic characteristics of speech during detection of the speech signal. The improved LMS filter processes noise in the input speech signal by adjusting the active parameter threshold for the input signal. Comparing the proposed method with the conventional frame energy method confirmed that the error rate at the start point of speech is 7% and that the error rate at the end point is improved by 11%.
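The quantities named in the abstract can be illustrated with a minimal sketch. It is not the paper's method. The abstract gives no formulas, so the code uses common textbook definitions of frame energy and crossing rate, and a standard one-step LMS linear predictor in place of the improved average-prediction variant. All function names, the frame size, the filter order, and the step size `mu` are assumptions chosen for illustration.

```python
import math


def frames(signal, size, hop):
    """Split a sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def frame_energy(frame):
    """Short-time energy: sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def crossing_rate(frame, level=0.0):
    """Fraction of adjacent sample pairs that lie on opposite sides of `level`.

    level=0.0 gives a zero-crossing rate; a non-zero level gives a
    level-crossing rate. Samples exactly equal to `level` are not counted.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a - level) * (b - level) < 0
    )
    return crossings / (len(frame) - 1)


def lms_predict(signal, order=4, mu=0.05):
    """Textbook LMS one-step linear predictor (assumed stand-in, not the paper's filter).

    Returns (predictions, errors), where errors[n] = signal[n] - predictions[n].
    """
    w = [0.0] * order
    preds, errs = [], []
    for n in range(len(signal)):
        # The `order` most recent past samples, zero-padded before the start.
        x = [signal[n - k - 1] if n - k - 1 >= 0 else 0.0 for k in range(order)]
        y = sum(wi * xi for wi, xi in zip(w, x))
        e = signal[n] - y
        # LMS weight update: step along the instantaneous error gradient.
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]
        preds.append(y)
        errs.append(e)
    return preds, errs


if __name__ == "__main__":
    # Hypothetical test signal: silence followed by a pure tone.
    tone = [math.sin(0.3 * n) for n in range(600)]
    signal = [0.0] * 80 + tone
    for i, f in enumerate(frames(signal, size=40, hop=40)[:4]):
        print(i, round(frame_energy(f), 3), round(crossing_rate(f), 3))
    _, errs = lms_predict(tone)
    print("early error energy:", sum(e * e for e in errs[:50]))
    print("late error energy:", sum(e * e for e in errs[-50:]))
```

In a detector along these lines, a frame would be marked as speech when its energy and crossing rates pass chosen thresholds. The predictor's error energy shrinks as the weights adapt to the predictable part of the signal. How the paper combines these features and adjusts its threshold is not specified in the abstract.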