Blind estimation of reverberation time using deep neural network
Myungin Lee, Joon‐Hyuk Chang
2016 IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC), published 2016-09-01
DOI: 10.1109/ICNIDC.2016.7974586
In this paper, we propose a method for estimating the reverberation time (T60) from an observed reverberant speech signal using a deep neural network (DNN). Reverberation is a critical issue in speech processing because it smears the characteristics of the sound in both the temporal and spectral domains, which degrades the performance of speech processing algorithms. Exploiting the room acoustic characteristics of reverberant speech can improve the performance of a speech processing system, so blind estimation of the reverberation time has been studied on the basis of numerical interpretations of reverberation. We adopt the speech decay rate and its distribution in each frequency bin as the input feature vectors of the DNN. The complex relation between each input feature vector and each T60 target label is learned through multiple nonlinear hidden layers. We also introduce an approach that reduces the computational complexity while maintaining reasonable performance.
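The abstract does not say how the per-bin decay rate is computed, so the following is only a minimal sketch under assumed definitions, not the authors' feature extraction. It assumes the decay rate of a frequency bin is the least-squares slope of that bin's log-energy envelope (in dB per second) over a segment where the sound is freely decaying, and it uses the standard definition of T60 as the time for the level to fall by 60 dB. The function names, the 10 ms frame period, and the synthetic test tracks are all hypothetical choices made for illustration.

```python
import math


def decay_rate_db_per_s(energies, frame_period_s):
    """Least-squares slope of a log-energy envelope, in dB per second.

    `energies` is a sequence of per-frame energies for one frequency bin
    (at least two frames), taken from a freely decaying segment.
    """
    # Convert to dB; the small floor avoids log10(0) on silent frames.
    levels = [10.0 * math.log10(max(e, 1e-12)) for e in energies]
    times = [i * frame_period_s for i in range(len(levels))]
    n = len(levels)
    mean_t = sum(times) / n
    mean_l = sum(levels) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, levels))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def t60_from_decay_rate(rate_db_per_s):
    """T60 is the time needed for the level to drop by 60 dB."""
    if rate_db_per_s >= 0:
        raise ValueError("energy is not decaying")
    return -60.0 / rate_db_per_s


def decay_features(band_energies, frame_period_s):
    """One decay rate per frequency bin: a candidate input feature vector."""
    return [decay_rate_db_per_s(track, frame_period_s) for track in band_energies]


def synthetic_track(t60_s, frame_period_s, n_frames):
    """Ideal exponential decay whose level falls 60 dB in `t60_s` seconds."""
    return [10.0 ** (-6.0 * i * frame_period_s / t60_s) for i in range(n_frames)]


# Three hypothetical frequency bins with different known decay times.
true_t60 = (0.3, 0.5, 0.8)
bands = [synthetic_track(t60, 0.01, 30) for t60 in true_t60]
features = decay_features(bands, 0.01)
estimates = [t60_from_decay_rate(rate) for rate in features]
```

On ideal exponential decays the slope alone recovers T60. Real reverberant speech rarely contains clean free-decay segments, so the measured decay rates are noisy and biased by the speech itself. That is the gap a learned mapping from decay-rate features to T60, such as the DNN proposed here, is meant to close.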