Decorrelating transforms for spectral vector quantization

International Conference on Digital Signal Processing proceedings : DSP. International Conference on Digital Signal Processing. Pub Date: 2013-07-01. DOI: 10.1109/ICDSP.2013.6622682

M. A. Ramírez


Citations: 0

Abstract


Split vector quantization (SVQ) quantizes line spectral frequencies (LSF) effectively and efficiently, but misses some dependencies between vector components. Switched SVQ (SSVQ) can recover part of the advantage offered by nonlinear dependencies through Gaussian mixture models (GMM). The remaining linear dependencies, or correlations, between vector components can be exploited by transform coding. The Karhunen-Loève transform (KLT) is normally used, but eigendecomposition and full transform matrices make it computationally complex. However, a family of transforms has recently been characterized by its capability to perform a generalized triangular decomposition (GTD) of the source covariance matrix. The prediction-based lower triangular transform (PLT) is the least complex of these transforms and is a component in the implementation of all of them. This paper proposes a minimum noise structure for PLT SVQ. Coding results for 16-dimensional LSF vectors from wideband speech show that GMM PLT SSVQ can achieve transparent quantization down to 41 bits/frame, with distortion performance close to that of GMM KLT SSVQ at about three-fourths of the operational complexity. Other members of the GTD family, such as the geometric mean decomposition (GMD) transform and the bidiagonal (BID) transform, fail to capitalize on their advantageous features because of the low bit rate per component in the range tested.
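To make the decorrelation idea concrete, the sketch below builds a unit lower triangular transform from a covariance matrix and checks that it diagonalizes that matrix, next to a KLT obtained by eigendecomposition. It is only an illustration under stated assumptions, not the paper's implementation: the function name `plt_from_covariance`, the route through a Cholesky factor, and the synthetic covariance `R[i, j] = rho**|i - j|` (a stand-in for LSF statistics, which are not available here) are all hypothetical. Quantization, the minimum noise structure, GMM switching and vector splitting are not modeled.

```python
import numpy as np


def plt_from_covariance(R):
    """Return (P, variances): P is unit lower triangular and P @ R @ P.T is diagonal.

    Uses R = L D L^T with L unit lower triangular, so that P = L^{-1}.
    """
    C = np.linalg.cholesky(R)                    # R = C C^T, C lower triangular
    d = np.diag(C)                               # positive diagonal of C
    L = C / d                                    # divide column j by d[j]: unit diagonal
    P = np.linalg.solve(L, np.eye(R.shape[0]))   # P = L^{-1}
    return P, d ** 2                             # d**2 is the diagonal of D


# Assumed toy covariance (NOT LSF data): R[i, j] = rho**|i - j|
n, rho = 16, 0.9
idx = np.arange(n)
R = rho ** np.abs(idx[:, None] - idx[None, :])

P, variances = plt_from_covariance(R)

# Covariance of the transformed vector; it should be diagonal.
T = P @ R @ P.T
max_off_diag = np.max(np.abs(T - np.diag(np.diag(T))))

# KLT for comparison: rows of `klt` are eigenvectors of R.
eigvals, eigvecs = np.linalg.eigh(R)
klt = eigvecs.T

# Count of matrix entries that must be stored and multiplied:
# a unit lower triangular matrix has n(n-1)/2 free entries, a full matrix has n^2.
plt_multipliers = n * (n - 1) // 2
klt_multipliers = n * n
```

Both transforms decorrelate the components, and because the triangular factor has unit determinant, the product of the PLT output variances equals the product of the eigenvalues of `R`. The coefficient counts here describe only the transform matrices themselves; the complexity ratio quoted in the abstract refers to the complete coding schemes.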
