Syu-Siang Wang, Jeremy Chiaming Yang, Yu Tsao, J. Hung
2016 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), pp. 1-2. Published 2016-05-27. DOI: 10.1109/ICCE-TW.2016.7521042 (https://doi.org/10.1109/ICCE-TW.2016.7521042)
Leveraging nonnegative matrix factorization in processing the temporal modulation spectrum for speech enhancement
This paper proposes using nonnegative matrix factorization (NMF) to enhance the temporal modulation components of speech signals and thereby reduce the effect of noise. For each acoustic frequency, NMF bases for the temporal modulations of clean speech and of noise are first extracted and then used to decompose the temporal modulation of the noise-corrupted speech signal. In this way the noise-free speech component is emphasized, and the updated speech signal has higher quality than its unprocessed counterpart. Moreover, the temporal modulations of neighboring acoustic frequencies can be processed together to improve computational efficiency without degrading the enhancement. Evaluation experiments conducted on a subset of the Aurora-2 connected-digit database show that the proposed method significantly improves the Perceptual Evaluation of Speech Quality (PESQ) scores of the signals.
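The decomposition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each column of a matrix V holds the nonnegative modulation magnitude spectrum of one segment at a given acoustic frequency, that the bases are learned with standard multiplicative updates for the Euclidean cost, and that the speech part is recovered with a soft mask. The abstract specifies none of these choices. All function names, the rank of 4, and the random toy data are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the updates


def nmf(V, rank, n_iter=200, seed=0):
    """Learn nonnegative W, H with V ~= W @ H (multiplicative updates, Euclidean cost)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def activations(V, W, n_iter=200, seed=0):
    """Estimate H for fixed bases W so that V ~= W @ H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
    return H


def enhance(V_noisy, W_speech, W_noise, n_iter=200):
    """Decompose noisy modulation magnitudes on the joint bases and keep the speech part."""
    W = np.hstack([W_speech, W_noise])
    H = activations(V_noisy, W, n_iter)
    k = W_speech.shape[1]
    speech = W_speech @ H[:k]   # part explained by the speech bases
    noise = W_noise @ H[k:]     # part explained by the noise bases
    mask = speech / (speech + noise + EPS)  # soft mask in [0, 1); an assumption
    return mask * V_noisy


# Toy data standing in for modulation magnitude spectra (assumption: 32 bins).
rng = np.random.default_rng(0)
V_speech = rng.random((32, 50))
V_noise = rng.random((32, 50))

W_s, _ = nmf(V_speech, rank=4)   # bases from "clean speech" training data
W_n, _ = nmf(V_noise, rank=4)    # bases from "noise" training data

V_noisy = V_speech[:, :10] + V_noise[:, :10]
V_hat = enhance(V_noisy, W_s, W_n)
```

In a full pipeline, the enhanced magnitudes would be combined with the noisy modulation phase and transformed back to the time domain, which this sketch omits. One way to process neighboring acoustic frequencies together, as the abstract mentions, would be to place their modulation spectra in the columns of a single matrix so that one set of bases serves the whole group. This too is an assumption about how the grouping might be realized.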