Source controlled variable bit-rate speech coder based on waveform interpolation

5th International Conference on Spoken Language Processing (ICSLP 1998). Pub Date: 1998-11-30. DOI: 10.21437/ICSLP.1998-395

F. Plante, B. Cheetham, D. Marston, P. A. Barrett


Citations: 4

Abstract



This paper describes a source-controlled variable bit-rate (SC-VBR) speech coder based on prototype waveform interpolation (PWI). The coder classifies speech into four modes: silence, voiced, unvoiced, and transition. These modes are detected after the speech has been decomposed into slowly evolving waveforms (SEW) and rapidly evolving waveforms (REW). The classification uses voice activity detection (VAD), the relative levels of the SEW and REW, and the cross-correlation coefficient between characteristic waveform segments. Gender adaptation improves the encoding of the SEW components. In tests on conversational speech, the SC-VBR coder achieves a compression factor of around 3. The VBR coder was evaluated against a fixed-rate 4.6 kbit/s PWI coder on clean and noisy speech and was found to perform better for male speech and for noisy speech.
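The abstract names the three inputs to the mode decision (a VAD flag, the relative level of SEW and REW, and a cross-correlation coefficient) but does not give the decision rule. The sketch below shows one plausible way such a four-mode classifier could be arranged. It is not the authors' algorithm: the function names, the order of the tests, and all three thresholds (`voiced_ratio`, `unvoiced_ratio`, `xcorr_min`) are assumptions chosen only for illustration.

```python
import math
from enum import Enum


class Mode(Enum):
    SILENCE = "silence"
    VOICED = "voiced"
    UNVOICED = "unvoiced"
    TRANSITION = "transition"


def cross_correlation_coefficient(a, b):
    """Normalized cross-correlation of two equal-length waveform segments."""
    if len(a) != len(b):
        raise ValueError("segments must have the same length")
    num = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / denom if denom > 0.0 else 0.0


def classify_frame(vad_active, sew_energy, rew_energy, xcorr,
                   voiced_ratio=2.0, unvoiced_ratio=0.5, xcorr_min=0.6):
    """Pick one of the four modes for a frame.

    All thresholds are hypothetical placeholders, not values from the paper.
    """
    if not vad_active:
        return Mode.SILENCE
    # Relative level of the slowly vs. rapidly evolving components.
    ratio = sew_energy / max(rew_energy, 1e-12)
    if ratio >= voiced_ratio and xcorr >= xcorr_min:
        # SEW dominates and successive waveforms are similar: periodic speech.
        return Mode.VOICED
    if ratio <= unvoiced_ratio and xcorr < xcorr_min:
        # REW dominates and successive waveforms are dissimilar: noise-like.
        return Mode.UNVOICED
    # Active speech that fits neither pattern is treated as a transition.
    return Mode.TRANSITION


if __name__ == "__main__":
    print(classify_frame(True, sew_energy=4.0, rew_energy=1.0, xcorr=0.9))
```

In a variable-rate coder, the returned mode would then select how many bits are spent on the frame; the bit allocation per mode is not stated in the abstract and is left out here.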
