1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351): Latest Articles
New speech enhancement techniques for low bit rate speech coding
R. Martin, R. Cox
In this paper we present novel solutions for pre-processing noisy speech prior to low bit rate speech coding. We strive especially to improve the estimation of spectral parameters and to reduce the additional algorithmic delay caused by the enhancement pre-processor. While the former is achieved using a new adaptive limiting algorithm for the a priori signal-to-noise ratio (SNR) estimate, the latter makes use of a novel overlap/add scheme. Our enhancement techniques were evaluated in conjunction with the 2400 bps mixed excitation linear prediction (MELP) coder by means of formal and informal listening tests.
DOI: https://doi.org/10.1109/SCFT.1999.781519
Published: 1999-06-20
Citations: 56
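The abstract above mentions limiting the a priori SNR estimate but does not give the rule. As a minimal sketch, the following uses the widely known decision-directed estimator with a simple lower limit; the function name, the smoothing factor `alpha`, and the fixed `floor` are illustrative assumptions and not the authors' adaptive algorithm.

```python
import numpy as np

def a_priori_snr(noisy_power, noise_power, prev_clean_power, alpha=0.98, floor=0.1):
    """Decision-directed a priori SNR estimate per frequency bin, limited from below.

    `floor` stands in for the limiting step; an adaptive scheme would
    recompute it per frame (assumption: the rule is not given in the abstract).
    """
    gamma = noisy_power / noise_power                 # a posteriori SNR
    instantaneous = np.maximum(gamma - 1.0, 0.0)      # current-frame estimate
    xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * instantaneous
    return np.maximum(xi, floor)                      # apply the lower limit

# Hypothetical three-bin frame: one speech-dominated bin, two noise-only bins.
noisy = np.array([4.0, 1.0, 0.5])
noise = np.array([1.0, 1.0, 1.0])
prev_clean = np.array([2.0, 0.0, 0.0])
xi = a_priori_snr(noisy, noise, prev_clean)
print(xi)
```

In this toy frame the two noise-only bins are held at the floor rather than dropping to zero, which is the usual motivation for limiting the estimate.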
Trellis code excited linear prediction (TCELP) speech coding
Cheng-Chieh Lee, Y. Shoham
This paper describes using the trellis-based scalar-vector quantizer for sources with memory to solve the excitation codebook search problem of code excited linear prediction (CELP) speech coders. This approach leads to a 24 kbit/s telephony-bandwidth low-delay (3 msec) trellis CELP coder, which outperforms both ITU-T 16 kbit/s G.728 LD-CELP and G.726 32 kbit/s ADPCM. Since the codebook is derived from a scalar alphabet, the proposed coder can effectively handle excitation vectors in the 24-dimensional space (to realize considerable vector quantization gains) and has a computational complexity of approximately 75% of that of ITU-T G.728 LD-CELP.
DOI: https://doi.org/10.1109/SCFT.1999.781500
Published: 1999-06-20
Citations: 1
Perceptual zerotrees for scalable wavelet coding of wideband audio
A. Aggarwal, V. Cuperman, K. Rose, A. Gersho
This paper introduces a new algorithm for scalable coding of wideband audio signals. The technique is based on quantization of bi-orthogonal wavelet transformed coefficients using a perceptual zerotree method. An initial zerotree estimate of the wavelet coefficients is computed, followed by scalar quantization of the coefficients according to perceptual thresholds. The choice of wavelet decomposition and encoding parameters for each frame is adapted to the source characteristics employing a rate distortion criterion. The scalability of the coder is due to the tree structure, which enables graceful degradation with decrease in bit rate. Preliminary subjective tests indicate near-transparent quality for average bit rates in the range of 1.5 to 2.5 bits per sample.
DOI: https://doi.org/10.1109/SCFT.1999.781469
Published: 1999-06-20
Citations: 8
A wideband speech and audio codec at 16/24/32 kbit/s using hybrid ACELP/TCX techniques
B. Bessette, R. Salami, C. Laflamme, R. Lefebvre
A hybrid ACELP/TCX algorithm for coding speech and music signals at 16, 24, and 32 kbit/s is presented. The algorithm switches between algebraic code excited linear prediction (ACELP) and transform coded excitation (TCX) modes on a 20-ms frame basis. Applying TCX on 20 ms frames improved the quality for music signals. Special care was taken to alleviate the switching artifacts between the two modes resulting in a transparent switching process. Subjective test results showed that for speech signals, the performance at 16, 24, and 32 kbit/s, is equivalent to G.722 at 48, 56, and 64 kbit/s, respectively. For music signals, the quality at 24 kbit/s was found equivalent to G.722 at 56 kbit/s. However, at 16 kbit/s, the quality for music was slightly lower than G.722 at 48 kbit/s.
DOI: https://doi.org/10.1109/SCFT.1999.781466
Published: 1999-06-20
Citations: 29
Recovery of speech spectral parameters using convex set projection
U. Visitkitjakarn, W. Chan, Yongyi Yang
Previous works have demonstrated that by preserving speech spectral "dynamics" during spectral parameter quantization and/or decoding, the quality of coded speech can be improved. We explore the use of projections onto convex sets (POCS) techniques to recover speech spectral parameters from their quantized versions. Unlike prior works, the POCS approach enables us to obtain solutions that satisfy precise constraints. Two constraint sets are used in our POCS recovery algorithm: one set constrains the "roughness" of the parameter trajectories, and the other set confines the parameters to the proper quantizer partition cells. Simulation of our algorithm has consistently produced improvements in both the subjective quality and objective distortion measurements.
DOI: https://doi.org/10.1109/SCFT.1999.781475
Published: 1999-06-20
Citations: 2
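The two constraint sets named in the abstract above can be illustrated with a small alternating-projection sketch. It assumes a scalar quantizer (so each partition cell is an interval) and measures "roughness" as the sum of squared first differences of one parameter trajectory; both choices, and all names below, are assumptions for illustration rather than the authors' formulation.

```python
import numpy as np

def project_cells(x, lo, hi):
    """Projection onto the quantizer cells (a box): clip each value to its interval."""
    return np.clip(x, lo, hi)

def project_roughness(x, max_rough, iters=60):
    """Projection onto {y : sum of squared first differences <= max_rough}.

    Solves (I + lam * D^T D) y = x and finds lam by bisection.
    Requires max_rough > 0.
    """
    n = len(x)
    D = np.diff(np.eye(n), axis=0)          # first-difference operator
    L = D.T @ D
    rough = lambda y: float(y @ L @ y)
    if rough(x) <= max_rough:
        return x                            # already inside the set
    solve = lambda lam: np.linalg.solve(np.eye(n) + lam * L, x)
    lo, hi = 0.0, 1.0
    while rough(solve(hi)) > max_rough:     # grow until the bound is met
        hi *= 2.0
    for _ in range(iters):                  # bisection on the multiplier
        mid = 0.5 * (lo + hi)
        if rough(solve(mid)) > max_rough:
            lo = mid
        else:
            hi = mid
    return solve(hi)

def pocs_recover(decoded, lo, hi, max_rough, sweeps=50):
    """Alternate the two projections, ending on the cell constraint."""
    x = decoded.copy()
    for _ in range(sweeps):
        x = project_roughness(x, max_rough)
        x = project_cells(x, lo, hi)
    return x

# Hypothetical trajectory: a smooth ramp quantized with step 0.25.
step = 0.25
true = np.linspace(0.0, 1.0, 8)
decoded = np.round(true / step) * step
cell_lo, cell_hi = decoded - step / 2, decoded + step / 2
recovered = pocs_recover(decoded, cell_lo, cell_hi, max_rough=0.15)
roughness = lambda y: float(np.sum(np.diff(y) ** 2))
print(roughness(decoded), roughness(recovered))
```

The recovered trajectory stays inside the cells the encoder actually selected, so it remains consistent with the transmitted indices while being smoother than the staircase of reconstruction levels.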
Integration of speech enhancement and coding techniques
M. Kuropatwinski, D. Leckschat, K. Kroschel, A. Czyżewski
Speech coding techniques commonly used in low bit rate analysis-by-synthesis linear predictive coders (LPAS coders) can serve as a speech signal model emphasizing its important features. In the paper it is shown how this coding method can be utilized for speech enhancement. Particularly, the speech signal is modeled as the output of a cascade of an adaptive formant filter and a pitch filter, driven by a white Gaussian process with variance changing with time. A signal estimation method based on the Kalman filter is investigated which implements this speech signal model. The proposed approach yields significantly better performance both in SNR and subjective impression than Kalman filter methods, which use only short-time speech parameters.
DOI: https://doi.org/10.1109/SCFT.1999.781520
Published: 1999-06-20
Citations: 5
Multi-rate wideband speech/channel codec based on MPEG-4/CELP for ETSI/GSM full-rate channel
A. Murashima, M. Serizawa, K. Ozawa
This paper proposes a wideband multi-rate speech and channel codec based on the MPEG-4/CELP for the ETSI/GSM full-rate channel. In order to improve coding performance under mobile environments, such as channel error and background noise, the proposed codec operates at three bit allocations between speech and channel coding with a constant gross bit-rate of 22.8 kbit/s. The speech coding bit-rates are 10.9, 12.1 and 15.9 kbit/s. It achieves high speech quality under any channel condition by switching the bit allocations and also for noisy speech by using the highest bit-rate. The preliminary subjective evaluation tests show the speech quality is improved by switching the bit allocation under error conditions. It is also comparable or superior to ITU-T Recommendation G.722 48 kbit/s for carrier-to-interference ratios (C/I) higher than 10 dB. The codec at 15.9 kbit/s also gives comparable speech quality to G.722 at 48 kbit/s under background noise conditions.
DOI: https://doi.org/10.1109/SCFT.1999.781470
Published: 1999-06-20
Citations: 3
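The bit allocations in the abstract above imply a simple budget. The sketch below works it out, assuming that everything in the 22.8 kbit/s gross rate not spent on speech coding goes to channel coding (the abstract does not itemize any other overhead); the function name is hypothetical.

```python
GROSS_BPS = 22_800                        # constant gross rate, from the abstract
SPEECH_BPS = [10_900, 12_100, 15_900]     # the three speech coding rates

def channel_coding_bps(speech_bps, gross_bps=GROSS_BPS):
    """Bits per second left for channel coding at a given speech rate."""
    if not 0 <= speech_bps <= gross_bps:
        raise ValueError("speech rate must fit within the gross rate")
    return gross_bps - speech_bps

# Integer bit/s values avoid floating-point rounding in the subtraction.
allocations = [(s, channel_coding_bps(s)) for s in SPEECH_BPS]
for speech, channel in allocations:
    print(f"speech {speech / 1000:.1f} kbit/s -> channel coding {channel / 1000:.1f} kbit/s")
```

Under that assumption the lowest speech rate leaves 11.9 kbit/s of protection and the highest leaves 6.9 kbit/s, which matches the abstract's use of the highest rate for clean channels with noisy speech.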
BEC++: a software tool for increased flexibility in algorithm development
M. Harton, K. Kapuscinski
Sometimes, there is little interest by algorithm developers in creating a fixed-point simulation from a floating-point algorithm. However, often it is vital that high levels of speech quality be maintained in a fixed-point application. The process of converting floating-point simulations to fixed-point is time-consuming and expensive, and if it is not done well, a state-of-the-art algorithm may never see product implementation. There is a critical need for software tools that reduce the time and effort that algorithm developers spend on floating-point to fixed-point software conversion. Bit-Exact C++ (BEC++) is just such a tool. This paper discusses a fixed-point software implementation tool, BEC++, with syntax similar in look and feel to that of floating-point C. Based on the ETSI Bit-Exact C (BEC) software now commonly used in industry, BEC++ extends the capabilities of BEC through the introduction of C++ language features and object-oriented techniques. This paper also details how to use the software, providing comparisons between BEC++ and BEC implementations.
DOI: https://doi.org/10.1109/SCFT.1999.781486
Published: 1999-06-20
Citations: 1
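The idea in the abstract above, fixed-point arithmetic written with floating-point-like syntax, can be illustrated with operator overloading. This is a Python sketch of the general technique only: the `Q15` class and its methods are hypothetical and are not the BEC++ or ETSI BEC interface, and no bit-exactness with either is claimed.

```python
class Q15:
    """Signed 16-bit fixed-point value with 15 fractional bits and saturation."""
    MAX, MIN = 32767, -32768

    def __init__(self, raw):
        # Saturate instead of wrapping, as fixed-point DSP arithmetic usually does.
        self.raw = max(self.MIN, min(self.MAX, int(raw)))

    @classmethod
    def from_float(cls, x):
        return cls(round(x * 32768))

    def to_float(self):
        return self.raw / 32768

    def __add__(self, other):
        return Q15(self.raw + other.raw)            # saturating add

    def __mul__(self, other):
        return Q15((self.raw * other.raw) >> 15)    # fractional multiply

# The expressions read like floating-point code, but every step saturates.
a = Q15.from_float(0.5)
b = Q15.from_float(0.75)
total = a + b        # 1.25 is out of range, so the result saturates
product = a * b      # 0.375 is representable exactly
print(total.raw, product.to_float())
```

The call-style alternative, where every operation is an explicit function call on integers, is what makes hand conversion tedious; overloading keeps the algorithm text close to its floating-point form.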
Multiple-description coding (MDC) of speech with an invertible auditory model
G. Kubin, W. Kleijn
Network signal processing aspects dominate in speech and audio coding applications such as Internet telephony or packet radio networks. We demonstrate that our approach to speech coding in a perceptual domain provides an implicit forward error concealment mechanism to handle random erasures of the channel. To this end, the individual acoustic subchannels of our auditory model are grouped into different transport subchannels or packets. Due to the strongly overlapping, redundant filterbank structure of the model, reconstruction of speech without audible degradation becomes possible even if a significant percentage of channels is erased (e.g., up to 40% in a 50-channel auditory model for narrowband speech). We discuss this result both from a hearing-physiology and a frame-theoretic perspective.
DOI: https://doi.org/10.1109/SCFT.1999.781491
Published: 1999-06-20
Citations: 18
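The frame-theoretic view mentioned in the abstract above can be shown with a toy example: a redundant linear analysis still determines the signal after some coefficients are erased, as long as the surviving analysis vectors span the signal space. The random matrix below is only a stand-in for the auditory filterbank, and the sizes (50 channels, 20 erased, signal dimension 20) are assumptions chosen to mirror the 40% figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dim, n_channels = 20, 50

# Toy redundant frame: 50 analysis vectors for a 20-dimensional signal.
frame = rng.standard_normal((n_channels, n_dim))
x = rng.standard_normal(n_dim)
coeffs = frame @ x                      # one coefficient per "channel"

# Erase 40% of the channels at random.
erased = rng.choice(n_channels, size=20, replace=False)
kept = np.setdiff1d(np.arange(n_channels), erased)

# Least-squares reconstruction from the surviving coefficients only.
x_hat, *_ = np.linalg.lstsq(frame[kept], coeffs[kept], rcond=None)
print(np.max(np.abs(x_hat - x)))
```

With 30 surviving vectors for 20 unknowns the system stays overdetermined, so the reconstruction is exact up to numerical precision; a real auditory model adds quantization and perceptual criteria that this sketch ignores.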
A low bit rate codec for AMR standard
M. Foodeei, H. Zarrinkoub, R. Matmti, R. Rabipour, F. Gabin, S. Gosne
We describe a low bit rate speech codec based on the RCELP paradigm and designed as a candidate for GSM-AMR. The relaxation of the waveform-matching constraint in the RCELP model allows for reducing the bit rate without affecting the speech quality. New efficient quantization methods for the LSF and gain parameters coupled with some algorithmic improvements result in a high quality speech codec at bit rates as low as 4.55 kbit/s. Subjective tests show encouraging results in terms of quality and robustness under various operating conditions.
DOI: https://doi.org/10.1109/SCFT.1999.781505
Published: 1999-06-20
Citations: 2