Power function-based power distribution normalization algorithm for robust speech recognition

Chanwoo Kim, R. Stern
DOI: 10.1109/ASRU.2009.5373233
URL: https://doi.org/10.1109/ASRU.2009.5373233
Published in: 2009 IEEE Workshop on Automatic Speech Recognition & Understanding
Publication date: 2009-12-01
Citations: 30

Abstract

This paper describes a novel algorithm that normalizes the distribution of spectral power coefficients. The algorithm, called power-function-based power distribution normalization (PPDN), is based on the observation that the ratio of the arithmetic mean to the geometric mean changes as speech is corrupted by noise; a parametric power function is used to equalize this ratio. We also observe that a longer "medium-duration" observation window (of approximately 100 ms) is better suited to estimating the parameters for noise compensation than the briefer window typically used for automatic speech recognition. We also describe the implementation of an online version of PPDN based on exponentially weighted temporal averaging. Experimental results show that PPDN provides comparable or slightly better results than state-of-the-art algorithms such as vector Taylor series for speech recognition while requiring much less computation. Hence, the algorithm is suitable both for real-time speech communication and as a real-time preprocessing stage for speech recognition systems.
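The ideas named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not give the form of the parametric power function or the estimation procedure, so the pure exponent `p ** a`, the bisection search in `fit_exponent`, the `forgetting` factor, and the toy data (log-normal "speech" powers plus a constant noise power) are all assumptions made for illustration. The sketch shows that, in this toy model, additive noise lowers the arithmetic-to-geometric mean ratio, and that an exponent can be chosen to bring the ratio back to its clean value.

```python
import math
import random


def am_gm_ratio(powers):
    """Arithmetic mean divided by geometric mean of positive values (always >= 1)."""
    n = len(powers)
    arithmetic = sum(powers) / n
    geometric = math.exp(sum(math.log(p) for p in powers) / n)
    return arithmetic / geometric


def fit_exponent(powers, target_ratio, lo=0.0, hi=8.0, iters=60):
    """Hypothetical helper: bisection for a such that am_gm_ratio(p ** a) matches target_ratio.

    Relies on the ratio being non-decreasing in a for a >= 0.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if am_gm_ratio([p ** mid for p in powers]) < target_ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ewma(values, forgetting=0.9):
    """Exponentially weighted running average; returns one smoothed value per input."""
    smoothed = []
    state = values[0]
    for v in values:
        state = forgetting * state + (1.0 - forgetting) * v
        smoothed.append(state)
    return smoothed


random.seed(0)
# Toy data (assumption): log-normal "clean" powers and a constant additive noise power.
clean = [math.exp(random.gauss(0.0, 1.5)) for _ in range(2000)]
noisy = [p + 1.0 for p in clean]

clean_ratio = am_gm_ratio(clean)
noisy_ratio = am_gm_ratio(noisy)
a = fit_exponent(noisy, clean_ratio)
restored_ratio = am_gm_ratio([p ** a for p in noisy])

print("clean ratio:   ", clean_ratio)
print("noisy ratio:   ", noisy_ratio)
print("fitted a:      ", a)
print("restored ratio:", restored_ratio)
print("running average:", ewma([0.0, 1.0, 1.0], forgetting=0.5))
```

In an online setting, the batch means inside `am_gm_ratio` could be replaced by running averages of the kind `ewma` computes, so that the statistics follow the signal over time. That mapping is a guess at how the pieces fit together, not a description of the paper's method.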