Hierarchical clustering and robust identification for block-based autoregressive speech parameter estimation
Ruofei Chen, C. Chan
2012 8th International Symposium on Chinese Spoken Language Processing, published 2012-12-01
DOI: 10.1109/ISCSLP.2012.6423482 (https://doi.org/10.1109/ISCSLP.2012.6423482)
Given accurate system parameters, such as the state transition matrix F and the corruption mapping matrix H, clean-speech autoregressive (AR) parameters can be estimated effectively from a series of noisy observations by Kalman filtering. In this paper, we address several fundamental issues in order to improve linear dynamical system (LDS) based AR parameter estimation. A hierarchical time-series clustering scheme is devised to group speech blocks that share similar trajectories and corruption types. In addition, a correlated robust identification scheme using an a posteriori signal-to-noise ratio (SNR) mask is proposed to improve identification accuracy. The effectiveness of the proposed clustering and identification schemes is evaluated in terms of the spectral distortion between the Kalman estimates and the true clean speech parameters. Significant improvement is observed over the original matrix quantization (MQ) based approach. The proposed scheme is also applied successfully in a model-based speech enhancement application and is expected to be effective for robust identification in various codebook-driven speech applications.
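The estimation step that the abstract takes as its starting point can be illustrated with a textbook Kalman filter on a linear dynamical system of the form x_t = F x_{t-1} + w_t, y_t = H x_t + v_t. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the noise covariances `Q` and `R`, the prior `x0` and `P0`, the state dimension, and all numeric values are hypothetical and chosen purely for a synthetic demonstration. The hierarchical clustering and the SNR-mask identification of F and H are not reproduced here.

```python
import numpy as np


def kalman_filter(y, F, H, Q, R, x0, P0):
    """Textbook Kalman filter for x_t = F x_{t-1} + w_t, y_t = H x_t + v_t.

    y is a (T, m) array of noisy observations; returns a (T, n) array of
    filtered state estimates. Q and R are the process and observation
    noise covariances (assumed known for this sketch).
    """
    x, P = x0.copy(), P0.copy()
    n = len(x0)
    estimates = []
    for y_t in y:
        # Prediction step: propagate the state and its covariance through F.
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # Update step: the gain is K = P_pred H^T S^{-1}, computed with a
        # linear solve (S and P_pred are symmetric) instead of an inverse.
        S = H @ P_pred @ H.T + R
        K = np.linalg.solve(S, H @ P_pred).T
        x = x_pred + K @ (y_t - H @ x_pred)
        P = (np.eye(n) - K @ H) @ P_pred
        estimates.append(x)
    return np.array(estimates)


# Synthetic demonstration; every value below is an illustrative assumption.
rng = np.random.default_rng(0)
p, T = 4, 200                            # hypothetical state size and block length
F = 0.98 * np.eye(p)                     # stand-in state transition matrix
H = 0.8 * np.eye(p) + 0.05 * np.ones((p, p))  # stand-in corruption mapping
Q = 0.01 * np.eye(p)                     # process noise covariance
R = 0.25 * np.eye(p)                     # observation noise covariance

x = rng.standard_normal(p)
x_true = np.zeros((T, p))
y = np.zeros((T, p))
for t in range(T):
    x = F @ x + np.sqrt(0.01) * rng.standard_normal(p)
    x_true[t] = x
    y[t] = H @ x + np.sqrt(0.25) * rng.standard_normal(p)

est = kalman_filter(y, F, H, Q, R, x0=np.zeros(p), P0=np.eye(p))

# Baseline for comparison: invert H on each observation without any filtering.
naive = np.linalg.solve(H, y.T).T
mse_kalman = float(np.mean((est - x_true) ** 2))
mse_naive = float(np.mean((naive - x_true) ** 2))
print(f"Kalman MSE: {mse_kalman:.4f}  naive inverse MSE: {mse_naive:.4f}")
```

In this toy setting the filter uses the temporal structure encoded in F to average out observation noise, which is why accurate F and H matter: a mismatched pair would bias the prediction and update steps alike.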