Deep Neural Network based Complex Spectrogram Reconstruction for Speech Bandwidth Expansion
Authors: Hongjiang Yu, Weiping Zhu
Venue: 2020 18th IEEE International New Circuits and Systems Conference (NEWCAS)
Published: 2020-06-01
DOI: https://doi.org/10.1109/newcas49341.2020.9159810
In this paper, we present a deep neural network (DNN) based complex spectrogram reconstruction algorithm for speech bandwidth expansion, in which the DNN estimates the real and imaginary parts of the wideband speech spectrogram from those of the narrowband speech. Unlike the previous DNN-based method, which estimates only the magnitude and uses a simple mirrored version of the phase for reconstruction, we use the complex spectrogram to recover the magnitude and phase of the high-frequency component simultaneously. Experimental results show that the proposed method outperforms non-negative matrix factorization (NMF) and state-of-the-art DNN-based speech bandwidth expansion methods in terms of objective performance metrics.
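To make the real/imaginary representation concrete, the following minimal sketch shows how a complex spectrogram can be split into a real-valued feature vector per frame and then recombined, so that magnitude and phase come out of a single estimate. It is an illustration under stated assumptions, not the authors' implementation: the frame length, hop size, Hann window, and the helper names `stft`, `to_features`, and `from_features` are all hypothetical, and the DNN that would map narrowband features to wideband features is omitted.

```python
import numpy as np


def stft(x, frame_len=64, hop=32):
    """Hann-windowed STFT (assumed settings); returns (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def to_features(spec):
    """Stack real and imaginary parts into one real-valued vector per frame."""
    return np.concatenate([spec.real, spec.imag], axis=1)


def from_features(feats):
    """Invert to_features: rebuild the complex spectrogram from stacked parts."""
    n_bins = feats.shape[1] // 2
    return feats[:, :n_bins] + 1j * feats[:, n_bins:]


# Stand-in for a wideband speech signal (random noise, for illustration only).
rng = np.random.default_rng(0)
wideband = rng.standard_normal(1024)

spec = stft(wideband)            # complex wideband spectrogram
feats = to_features(spec)        # what a regression network would be trained to predict
rebuilt = from_features(feats)   # complex spectrogram recovered from that prediction

# Magnitude and phase both follow from the same complex estimate.
magnitude = np.abs(rebuilt)
phase = np.angle(rebuilt)
```

In a magnitude-only scheme the network output would stop at `magnitude`, and the phase would have to be supplied separately; here both are obtained from the same stacked real/imaginary vector.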