Decorrelating transforms for spectral vector quantization
M. A. Ramírez
International Conference on Digital Signal Processing (DSP) proceedings, pp. 1-6, published 2013-07-01
DOI: 10.1109/ICDSP.2013.6622682 (https://doi.org/10.1109/ICDSP.2013.6622682)
Split vector quantization (SVQ) performs well and efficiently for line spectral frequency (LSF) quantization, but misses some dependencies between components. Switched SVQ (SSVQ) can recover part of the advantage offered by nonlinear dependencies by means of Gaussian mixture models (GMM). The remaining linear dependencies, that is, the correlations between vector components, can be exploited by transform coding. The Karhunen-Loève transform (KLT) is normally used, but its eigendecomposition and full transform matrices make it computationally complex. However, a family of transforms has recently been characterized through the generalized triangular decomposition (GTD) of the source covariance matrix. The prediction-based lower triangular transform (PLT) is the least complex of these transforms and is a component in the implementation of all of them. This paper proposes a minimum noise structure for PLT SVQ. Coding results for 16-dimensional LSF vectors from wideband speech show that GMM PLT SSVQ can achieve transparent quantization down to 41 bit/frame, with distortion performance close to that of GMM KLT SSVQ at about three-fourths of the operational complexity. Other members of the GTD family, such as the geometric mean decomposition (GMD) transform and the bidiagonal (BID) transform, fail to capitalize on their advantageous features because of the low bit rate per component in the range tested.
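To make the contrast between the KLT and the PLT concrete, the following is a minimal NumPy sketch and not the paper's implementation. It assumes that the PLT can be obtained as the inverse of the unit lower triangular factor in an LDL^T factorization of the covariance matrix, and it uses a toy first-order autoregressive covariance (the value `rho = 0.9` and the helper names `plt_from_covariance`, `klt_from_covariance`, and `coding_gain` are hypothetical) in place of covariances estimated from LSF data. Quantization, GMM switching, and the minimum noise structure are not modeled.

```python
import numpy as np


def plt_from_covariance(R):
    """Return (P, d) with P unit lower triangular and P @ R @ P.T == diag(d)."""
    C = np.linalg.cholesky(R)            # R = C C^T, C lower triangular
    c = np.diag(C)                       # positive diagonal of C
    L = C / c                            # scale column j by 1/c[j]: R = L diag(c^2) L^T
    P = np.linalg.solve(L, np.eye(R.shape[0]))  # P = L^{-1}, still unit lower triangular
    return P, c ** 2


def klt_from_covariance(R):
    """Return (K, w) with K orthonormal and K @ R @ K.T == diag(w)."""
    w, V = np.linalg.eigh(R)             # eigenvalues w, eigenvectors in columns of V
    return V.T, w


def coding_gain(R, variances):
    """Arithmetic mean of input variances over geometric mean of coefficient variances."""
    return np.mean(np.diag(R)) / np.exp(np.mean(np.log(variances)))


# Toy covariance (assumption): AR(1)-like correlation between 16 components.
n, rho = 16, 0.9
idx = np.arange(n)
R = rho ** np.abs(idx[:, None] - idx[None, :])

P, d_plt = plt_from_covariance(R)
K, d_klt = klt_from_covariance(R)

gain_plt = coding_gain(R, d_plt)
gain_klt = coding_gain(R, d_klt)
print("PLT coding gain:", gain_plt)
print("KLT coding gain:", gain_klt)
```

Both transforms diagonalize the covariance, and because the PLT matrix has unit determinant, the product of its coefficient variances equals the product of the KLT eigenvalues, so the two gain figures coincide in this idealized setting. The complexity difference mentioned in the abstract is visible in the structure: `P` is triangular with a unit diagonal, so applying it takes roughly half the multiplications of the full matrix `K`, and no eigendecomposition is needed to design it. Note that `P` is not orthonormal, so treating this gain figure as the gain of an actual coder relies on quantization noise not being amplified at reconstruction, which is presumably what a minimum noise structure addresses.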