1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351): latest articles


LPC quantization requirements for the GPP-CELP coder

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781477

P. Mermelstein, Y. Qian, K. Zarrinkoub

Code-excited linear prediction coding with generalized pitch prediction (GPP-CELP) requires linear prediction filtering of the stochastic codebook output prior to addition of the adaptive codebook (ACE) component. The ACE component represents a sequence of past reconstructed samples passed through a low-pass filter to reflect the reduced pitch periodicity of the higher speech frequencies. The spectrum of the residual manifests broad peaks leading to significantly narrower distributions in the LPC parameter space. Additionally, the quantization error of the residual may be masked by the significantly greater energy of the ACE component. This work compares the quantization requirements for the information required to represent the time-varying LPC filter of the GPP-CELP coder with that of the classical CELP coder. With non-predictive coding of the LPC information a bit-rate reduction from 20 bits/20 ms to 16 bits/20 ms appears feasible without introducing noticeable degradation due to quantization.
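The bit allocations quoted above can be converted to bit rates with a short calculation. This is only an arithmetic sketch; the function name `bit_rate_bps` is a hypothetical helper, not something taken from the paper.

```python
def bit_rate_bps(bits_per_frame, frame_ms):
    """Convert a per-frame bit allocation into bits per second."""
    return bits_per_frame * 1000 / frame_ms

# LPC side information before and after the reduction stated in the abstract.
classical = bit_rate_bps(20, 20)   # 20 bits every 20 ms
gpp_celp = bit_rate_bps(16, 20)    # 16 bits every 20 ms
saving = 1 - gpp_celp / classical  # relative reduction in LPC bit rate
print(classical, gpp_celp, round(saving, 2))
```

Under these figures the LPC information drops from 1000 bit/s to 800 bit/s, a reduction of one fifth.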

Citations: 1

Parametric speech coding: HVXC at 2.0-4.0 kbps

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781492

M. Nishiguchi, A. Inoue, Y. Maeda, J. Matsumoto

The MPEG-4 parametric speech coding algorithm, harmonic vector excitation coding (HVXC), is described. New features of the coder include a quantizer scheme capable of generating 2.0 and 4.0 kbps scalable bit-streams, where 2.0 kbps decoding is possible using a subset of the 4.0 kbps bit-stream. Time-scale modification of speech is also possible without changing pitch or phonemes, for fast and slow playback modes. Listening tests show that the proposed coding method at 2.0 kbps provides significantly better quality than FS1016 CELP at 4.8 kbps. In October 1998, the HVXC coder was adopted in the Final Draft International Standard (FDIS) of MPEG-4.

Citations: 16

A novel pitch-lag search method using adaptive weighting and median filtering

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781502

P. Ojala, P. Haavisto, A. Lakaniemi, J. Vainio

This paper presents a novel method to estimate the pitch-lag in a speech codec. The pitch-lag is related to the fundamental frequency of the speech signal, and an accurate estimate of this parameter is important for the subjective quality of the synthesised speech. A common problem in speech codecs is that the estimation of the pitch-lag often produces a multiple or a sub-multiple of the true pitch value. When these incorrect pitch-lag values are used in speech synthesis, the subjective quality of the speech is degraded. This paper presents an improved method where the estimation of the pitch-lag parameter is biased towards the pitch-lag values of the previous speech segments, resulting in a consistent set of consecutive pitch-lag values and a high quality reconstructed signal. The classification of speech into voiced and unvoiced parts is used when tracking the pitch-lag values and adapting the pitch-track-centered weighting function.
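The idea of biasing the lag search towards the recent pitch track can be illustrated with a minimal sketch. It is not the authors' algorithm: the weighting shape, the `strength` constant, the lag range, and the function names are all assumptions chosen for illustration. The weight is centered on the median of the previous lags and is applied only when the recent frames were classified as voiced.

```python
from statistics import median

def autocorr(signal, lag):
    """Unnormalised autocorrelation of `signal` at the given lag."""
    return sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))

def weighted_pitch_lag(signal, prev_lags, voiced,
                       min_lag=20, max_lag=143, strength=0.3):
    """Pick the lag that maximises a weighted autocorrelation.

    The weighting and all constants are illustrative assumptions.
    """
    # Median of recent lags acts as the centre of the pitch track.
    center = median(prev_lags) if (voiced and prev_lags) else None
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = autocorr(signal, lag)
        if center is not None:
            # Weight falls off with relative distance from the tracked lag.
            score *= 1.0 - strength * min(1.0, abs(lag - center) / center)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Pulse train with period 40 whose alternate pulses are weaker, so the
# unweighted search prefers the pitch multiple 80.
signal = [0.0] * 320
for i, pos in enumerate(range(0, 320, 40)):
    signal[pos] = 1.0 if i % 2 == 0 else 0.5

print(weighted_pitch_lag(signal, [40, 41, 40], voiced=False))  # no tracking
print(weighted_pitch_lag(signal, [40, 41, 40], voiced=True))   # tracked
```

With tracking disabled the search lands on the doubled lag; with the track-centered weight it returns the lag that is consistent with the previous frames.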

Citations: 3

Optimized error correction of MELP speech parameters via maximum a posteriori (MAP) techniques

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781490

D.J. Rahikka, T. Fuja, T. Fazel

The U.S. Government has developed and adopted a new Federal standard vocoder, MELP (mixed excitation linear prediction), which operates at 2400 bps. This algorithm has quite good voice quality under benign channel error conditions. However, under high error conditions such as may be experienced in vehicular applications, correction techniques may be employed which utilize the underlying inter-frame residual redundancy of the MELP parameters. This paper describes experiments conducted on the MELP algorithm when combined with Viterbi decoding of a convolutional code and enhanced with maximum a posteriori techniques which capitalize on the redundancy statistics. Both hard-decision and soft-decision Viterbi decoding are investigated.
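A minimal sketch of how inter-frame redundancy can feed a MAP decision is given below. It is a simplification, not the paper's system: it assumes hard-decision bits with independent errors at a known residual bit error rate, and a first-order transition table `transition[prev][cur]` estimated offline. The function name and the table are hypothetical.

```python
import math

def map_decode(received_bits, prev_index, transition, bit_error_rate):
    """Return the index maximising P(received | index) * P(index | prev_index).

    Assumes independent bit errors (binary symmetric channel) and
    0 < bit_error_rate < 1.
    """
    n_bits = len(received_bits)
    best_index, best_logp = None, float("-inf")
    for index, prior in enumerate(transition[prev_index]):
        if prior <= 0.0:
            continue  # impossible transition under the model
        # Bit pattern of the candidate index, most significant bit first.
        candidate = [(index >> (n_bits - 1 - k)) & 1 for k in range(n_bits)]
        flips = sum(c != r for c, r in zip(candidate, received_bits))
        log_like = (flips * math.log(bit_error_rate)
                    + (n_bits - flips) * math.log(1.0 - bit_error_rate))
        logp = log_like + math.log(prior)
        if logp > best_logp:
            best_index, best_logp = index, logp
    return best_index

# Hypothetical 2-bit parameter that tends to repeat from frame to frame.
transition = [[0.85 if i == j else 0.05 for j in range(4)] for i in range(4)]
print(map_decode([1, 1], 1, transition, 0.1))   # noisy channel
print(map_decode([1, 1], 1, transition, 0.01))  # cleaner channel
```

On the noisy channel the prior outweighs the single-bit disagreement and the previous value is kept; on the cleaner channel the received pattern is trusted.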

Citations: 7

Wideband speech coding using forward/backward adaptive prediction with mixed time/frequency domain excitation

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781465

J. Schnitzler, J. Eggers, C. Erdmann, P. Vary

This paper describes a wideband (7 kHz) speech coding scheme using code-excited linear prediction (CELP) with mixed time and frequency domain excitation. The proposed frequency domain innovation can be used as an alternative to, or in parallel with, a time domain codebook. In addition, an improved synthesis filter is used, consisting of a signal-dependent combination of a forward adaptive and a backward adaptive (FA/BA) structure. An experimental codec operating at 15.5 or 20.0 kbit/s is demonstrated.

Citations: 5

Enhanced waveform interpolative coding at 4 kbps

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781494

O. Gottesman, A. Gersho

This paper presents an enhanced waveform interpolative (EWI) speech coder at 4 kbps. The system incorporates novel features such as analysis-by-synthesis (AbS) vector quantization (VQ) of the dispersion phase, AbS optimization of the slowly evolving waveform (SEW), a special pitch search for transitions, and switched-predictive AbS gain VQ. Subjective quality tests indicate that its quality exceeds that of MPEG-4 at 4 kbps and of G.723.1 at 5.3 kbps, and is slightly better than that of G.723.1 at 6.3 kbps.

Citations: 17

Voice activity detection for GSM adaptive multi-rate codec

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781482

A. Vahatalo, I. Johansson

This paper describes the VAD (voice activity detection) for controlling DTX (discontinuous transmission) of the GSM AMR (adaptive multi-rate) speech codec. The algorithm is based on spectral estimation and periodicity detection. The VAD contains a 9-band IIR filter bank, which divides input signals into frequency bands. The signal level at each band is calculated. Background noise is estimated in each sub-band. The VAD decision is computed by comparing input signal level and background noise estimate. The algorithm incorporates novel methods to estimate background noise and to detect periodic components based on open-loop pitch gain. A new method is also derived to detect correlated complex signals like music.
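The comparison of per-band signal level against a background noise estimate can be sketched as follows. This is not the standardized AMR VAD: the filter bank is omitted (band levels are taken as given), and the summed-SNR rule, the `threshold`, the smoothing factor `alpha`, and the function names are assumptions made for illustration.

```python
def vad_decision(band_levels, noise_levels, threshold=18.0):
    """Flag speech when the summed squared per-band level-to-noise ratio
    exceeds a threshold (illustrative rule and constant)."""
    snr_sum = sum((level / max(noise, 1e-9)) ** 2
                  for level, noise in zip(band_levels, noise_levels))
    return snr_sum > threshold

def update_noise(band_levels, noise_levels, speech, alpha=0.9):
    """Track background noise with a one-pole smoother, frozen during speech."""
    if speech:
        return list(noise_levels)
    return [alpha * noise + (1.0 - alpha) * level
            for level, noise in zip(band_levels, noise_levels)]

# Nine bands, matching the band count mentioned in the abstract.
noise = [1.0] * 9
quiet_frame = [1.0] * 9
loud_frame = [3.0] * 9
for frame in (quiet_frame, loud_frame):
    speech = vad_decision(frame, noise)
    noise = update_noise(frame, noise, speech)
    print(speech)
```

Freezing the noise update while speech is flagged keeps the estimate from drifting upwards during talk spurts, which is one common design choice for this kind of detector.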

Citations: 14

On waveform-interpolation coding with asymptotically perfect reconstruction

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781495

T. Eriksson, W. Kleijn

For coders which must produce high speech quality, it is beneficial to have a coding structure which gives zero distortion in the waveform when the quantizer error vanishes (asymptotically perfect reconstruction, APR). It is possible to introduce this property to waveform interpolation (WI) coders by using perfect reconstruction filter banks for analysis and synthesis. Unfortunately, the perfect-reconstruction filter banks are, in general, associated with disadvantages such as oversampling, a loss of physical meaning of the parameters, and increased delay. These disadvantages disappear for the filter bank based on the block DFT transform, but the latter method suffers from energy discontinuities. By using a pre-processor in combination with a block-DFT based WI coder, a coding structure is obtained which maintains the advantages of earlier WI coders and adds the APR property. This new structure is most useful for higher rate WI coders.

Citations: 11

Recursive coding of spectrum parameters

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781476

J. Samuelsson, P. Hedelin

Estimates of optimal performance in terms of spectral distortion (SD) for first-order time-recursive spectrum coders are presented. Extensions of high rate theory provide the formulas to calculate these estimates and also show how to design coders with optimal VQ point density. For this purpose, the PDF of the current spectrum parameter vector, given the previous one, is needed. This conditional PDF is obtained analytically from a model PDF for pairs of consecutive parameter vectors, based on Gaussian mixtures. The theory gives a lower bound of 16 bits to achieve 1 dB SD. Practical coders must base the adaptive codebook design on quantized previous vectors, and experiments suggest that another 2-3 bits are needed to achieve 1 dB SD. Informal subjective tests indicate that transparent quality may be maintained at even lower rates.
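Spectral distortion is commonly computed as the root-mean-square difference, in dB, between the log power spectra of the unquantized and quantized LPC filters. The sketch below follows that common definition over the full band; the abstract does not state the exact variant used, and the function names, the frequency grid, and the first-order example filters are assumptions.

```python
import cmath
import math

def lpc_power_spectrum_db(a, n_points=256):
    """Power spectrum in dB of 1/A(z), with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ...,
    sampled on n_points frequencies in [0, pi)."""
    spectrum = []
    for k in range(n_points):
        w = math.pi * k / n_points
        A = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(a))
        spectrum.append(-20.0 * math.log10(abs(A)))  # 10*log10(1/|A|^2)
    return spectrum

def spectral_distortion_db(a_ref, a_quant, n_points=256):
    """RMS difference in dB between two LPC log power spectra."""
    ref = lpc_power_spectrum_db(a_ref, n_points)
    qnt = lpc_power_spectrum_db(a_quant, n_points)
    return math.sqrt(sum((r - q) ** 2 for r, q in zip(ref, qnt)) / n_points)

# Hypothetical first-order filters standing in for original and quantized LPC.
print(spectral_distortion_db([-0.9], [-0.85]))
```

Averaging this quantity over many frames gives the figure that is compared against the 1 dB target mentioned above.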

Citations: 58

Design of test sequences for G.729 Annex E

1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)

Pub Date : 1999-06-20 DOI: 10.1109/SCFT.1999.781504

S. Ragot, R. Salami, R. Lefebvre

The 11.8 kb/s extension of the G.729 codec, also known as Annex E of the G.729 Recommendation, has been ratified by the ITU-T. This paper describes how the related test sequences have been designed, using the fixed-point C simulation of the codec. The design method is based on the concept of coverage, already used in the design of test sequences for the G.729 codec. Coverage ensures that all possible parameter values are observed in the bitstream, and that all portions of the algorithm are executed at least once. Experiments showed that this approach provides satisfactory reliability.
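The parameter side of the coverage criterion can be illustrated with a small sketch that measures which values of each quantized parameter appear in a set of encoded frames. The parameter names and bit widths are hypothetical and do not reflect the actual G.729 Annex E bitstream layout; code-path coverage would be measured separately with a coverage tool on the C simulation.

```python
def parameter_coverage(frames, bit_widths):
    """Return, per parameter, the fraction of possible values seen in `frames`.

    `frames` is a list of dicts mapping parameter name to its integer index;
    `bit_widths` maps parameter name to its field width in bits.
    """
    seen = {name: set() for name in bit_widths}
    for frame in frames:
        for name, value in frame.items():
            seen[name].add(value)
    return {name: len(seen[name]) / (1 << bits)
            for name, bits in bit_widths.items()}

# Hypothetical two-parameter bitstream: a 2-bit gain and a 3-bit pitch index.
bit_widths = {"gain": 2, "pitch": 3}
frames = [
    {"gain": 0, "pitch": 1},
    {"gain": 1, "pitch": 1},
    {"gain": 2, "pitch": 5},
    {"gain": 3, "pitch": 7},
]
print(parameter_coverage(frames, bit_widths))
```

A test sequence would be extended until every parameter reports full coverage, which here is already true for the gain field but not for the pitch field.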

Citations: 1
