Voicing Class Dependent Huffman Coding of Compressed Front-End Feature Vector for Distributed Speech Recognition

Deok Su Kim, H. Kim
Published in: 2008 Second International Conference on Future Generation Communication and Networking Symposia
Publication date: 2008-12-13
DOI: 10.1109/FGCNS.2008.44 (https://doi.org/10.1109/FGCNS.2008.44)
Citations: 3

Abstract

We propose an entropy coding method that further compresses the quantized mel-frequency cepstral coefficients (MFCCs) extracted for distributed speech recognition (DSR). In the ETSI extended DSR standard, MFCCs are compressed together with additional parameters such as pitch and voicing class. We observe that the distribution of the MFCCs varies with the voicing class, which makes it possible to design a separate Huffman tree for each voicing class. Exploiting this observation reduces the bit rate of the compressed MFCCs relative to Huffman coding that does not consider the voicing class. Experiments show that the proposed method achieves 34.18 bits per frame, which is 1.84 bits per frame lower than Huffman coding without voicing class information.
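The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`huffman_code_lengths`, `average_bits`), the two class labels, and the toy quantizer indices in `toy_frames` are all hypothetical, chosen only so that the index distribution differs between classes. The sketch also assumes the decoder already knows each frame's voicing class, which seems consistent with the abstract's statement that voicing class is one of the parameters compressed alongside the MFCCs, so no extra bits are counted for selecting a table.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over `freqs`."""
    if len(freqs) == 1:
        # A lone symbol still needs one bit to be representable.
        return {next(iter(freqs)): 1}
    tiebreak = count()  # unique counter so heap entries never compare dicts
    heap = [(f, next(tiebreak), {s: 0}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, depths1 = heapq.heappop(heap)
        f2, _, depths2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        merged = {s: d + 1 for s, d in depths1.items()}
        merged.update({s: d + 1 for s, d in depths2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


def average_bits(frames, class_dependent):
    """Average code length per frame for (voicing_class, index) pairs.

    class_dependent=True builds one Huffman table per voicing class;
    class_dependent=False pools all frames into a single table.
    """
    groups = defaultdict(list)
    for voicing_class, index in frames:
        groups[voicing_class if class_dependent else None].append(index)
    total_bits = 0
    for indices in groups.values():
        freqs = Counter(indices)
        lengths = huffman_code_lengths(freqs)
        total_bits += sum(freqs[s] * lengths[s] for s in freqs)
    return total_bits / len(frames)


# Hypothetical toy data: the same four quantizer indices, but with a
# distribution that is skewed in opposite directions for the two classes.
toy_frames = (
    [("voiced", 0)] * 8 + [("voiced", 1)] * 4
    + [("voiced", 2)] * 2 + [("voiced", 3)] * 2
    + [("unvoiced", 3)] * 8 + [("unvoiced", 2)] * 4
    + [("unvoiced", 1)] * 2 + [("unvoiced", 0)] * 2
)

if __name__ == "__main__":
    print("class-independent:", average_bits(toy_frames, False), "bits/frame")
    print("class-dependent:  ", average_bits(toy_frames, True), "bits/frame")
```

In this toy setting the pooled distribution is nearly flat, so a single table spends 2.0 bits per frame, while the per-class tables exploit the skew within each class and spend 1.75 bits per frame. For scale, the figures reported in the abstract imply a class-independent baseline of 34.18 + 1.84 = 36.02 bits per frame, so the reported saving is roughly 5%.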