A variable-bit-rate speech coding algorithm based on enhanced mixed excitation linear prediction

Ye Li, Qiuyun Hao, P. Zhang, Jingsai Jiang, Xiaofeng Ma, Yanhong Fan, H. V. Davydau
DOI: 10.1109/CISP-BMEI.2016.7852841
Published in: 2016 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2016-10-01
URL: https://doi.org/10.1109/CISP-BMEI.2016.7852841
Citations: 7

Abstract

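To improve channel bandwidth utilization in voice communication, this paper proposes a variable-bit-rate speech coding algorithm based on enhanced mixed excitation linear prediction (MELPe). In voice communication, talking occupies only about 40% of the time; the rest is silence or background noise. In addition, in low-bit-rate speech coding algorithms an unvoiced frame usually requires a lower transmission rate than a voiced one. Coding every frame at the same bit rate therefore wastes channel resources. In the proposed algorithm, voice activity detection (VAD) divides the input signal into speech and silence, and the speech frames are further classified as voiced or unvoiced. Each frame type is coded and transmitted at its own rate. In a voiced frame, all parameters are encoded, transmitted, and decoded. In an unvoiced frame, only the gain parameters, LSF parameters, pitch parameters, and overall voicing are encoded, transmitted, and decoded. In a silence frame, only the gain parameters and the first-level LSF parameters are encoded, transmitted, and decoded. When talking occupies about 40% of the time, the average coding rate of the proposed variable-bit-rate vocoder can reach 1.33 kbps, compared with 2.4 kbps for the traditional MELPe vocoder, while achieving the same synthetic speech quality. Experimental results show that the proposed method reduces the average coding rate and that the synthesized background noise is comfortable to listen to.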
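The frame classification and rate averaging described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives only the 2.4 kbps fixed rate and the 1.33 kbps average, so the per-class rates and frame-type fractions used in the example are hypothetical placeholders, and the names `FrameType`, `TRANSMITTED`, `classify_frame`, and `average_rate_kbps` are invented here for illustration. The VAD and voicing decisions are assumed to be supplied as booleans by some external detector.

```python
from enum import Enum


class FrameType(Enum):
    VOICED = "voiced"
    UNVOICED = "unvoiced"
    SILENCE = "silence"


# Parameter groups sent per frame type, following the abstract.
# "remaining_melpe_parameters" is a placeholder for whatever else the
# full-rate voiced frame carries; the abstract does not enumerate them.
TRANSMITTED = {
    FrameType.VOICED: ("gain", "lsf", "pitch", "overall_voicing",
                       "remaining_melpe_parameters"),
    FrameType.UNVOICED: ("gain", "lsf", "pitch", "overall_voicing"),
    FrameType.SILENCE: ("gain", "lsf_first_level"),
}


def classify_frame(is_speech: bool, is_voiced: bool) -> FrameType:
    """Map a VAD decision and a voicing decision to a frame type.

    The VAD decision is checked first: a non-speech frame is silence
    regardless of the voicing flag.
    """
    if not is_speech:
        return FrameType.SILENCE
    return FrameType.VOICED if is_voiced else FrameType.UNVOICED


def average_rate_kbps(fractions: dict, rates_kbps: dict) -> float:
    """Average coding rate as the fraction-weighted sum of per-class rates."""
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("frame-type fractions must sum to 1")
    return sum(fractions[t] * rates_kbps[t] for t in FrameType)


if __name__ == "__main__":
    # Hypothetical split: 40% speech (25% voiced + 15% unvoiced), 60% silence.
    fractions = {FrameType.VOICED: 0.25,
                 FrameType.UNVOICED: 0.15,
                 FrameType.SILENCE: 0.60}
    # Hypothetical per-class rates; only the 2.4 kbps figure is from the abstract.
    rates = {FrameType.VOICED: 2.4,
             FrameType.UNVOICED: 1.2,
             FrameType.SILENCE: 0.6}
    print(classify_frame(is_speech=True, is_voiced=False))
    print(average_rate_kbps(fractions, rates))
```

The placeholder numbers are not meant to reproduce the reported 1.33 kbps; they only show that the average falls below the fixed 2.4 kbps rate whenever unvoiced and silence frames are coded with fewer bits.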