Gammatone wavelet Cepstral Coefficients for robust speech recognition

2013 IEEE International Conference of IEEE Region 10 (TENCON 2013). Pub Date: 2013-10-01. DOI: 10.1109/TENCON.2013.6718948

A. Adiga, Mathew Magimai, C. Seelamantula

Citations: 36

Abstract

We develop noise-robust features using Gammatone wavelets, which are derived from the popular Gammatone functions. These wavelets incorporate characteristics of the human peripheral auditory system, in particular the spatially varying frequency response of the basilar membrane. We refer to the new features as Gammatone Wavelet Cepstral Coefficients (GWCC). The procedure for extracting GWCCs from a speech signal is similar to that of the conventional Mel-Frequency Cepstral Coefficients (MFCC) technique; the only difference is the type of filterbank used. We replace the conventional mel filterbank in MFCC with a Gammatone wavelet filterbank, which we construct using Gammatone wavelets. We also explore Gammatone filterbank based features, Gammatone Cepstral Coefficients (GCC), for robust speech recognition. On the AURORA 2 database, a comparison of GWCCs and GCCs with MFCCs shows that Gammatone-based features yield better recognition performance at low SNRs.
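As a rough illustration of the "MFCC pipeline with a different filterbank" idea, the sketch below computes cepstral coefficients from a plain Gammatone filterbank, which is closest to the GCC variant mentioned in the abstract. It is not the authors' implementation, and the Gammatone wavelet construction itself is not reproduced because the abstract does not give it. Everything specific here is an assumption: the function names (`erb`, `erb_space`, `gammatone_power_responses`, `gcc_like`), the 4th-order filter, the 1.019 bandwidth factor, the Glasberg and Moore ERB approximation, the 8 kHz sampling rate, and the frame, hop, filter and coefficient counts.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_space(f_lo, f_hi, n):
    """Return n centre frequencies equally spaced on the ERB-rate scale."""
    to_erb = lambda f: 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)
    from_erb = lambda e: (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37
    return from_erb(np.linspace(to_erb(f_lo), to_erb(f_hi), n))

def gammatone_power_responses(fs, n_fft, centres, order=4, dur=0.032):
    """Squared magnitude responses of Gammatone filters, one row per channel.

    Uses the standard impulse response t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t).
    The order and the 1.019 bandwidth factor are assumed, commonly used values.
    """
    t = np.arange(int(dur * fs)) / fs
    fc = centres[:, None]
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    G = np.abs(np.fft.rfft(g, n=n_fft, axis=1)) ** 2
    return G / G.max(axis=1, keepdims=True)  # peak-normalise each channel

def dct2(x, n_out):
    """Unnormalised type-II DCT along the last axis, keeping n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T

def gcc_like(signal, fs=8000, frame_len=200, hop=80, n_fft=256,
             n_filters=23, n_ceps=13):
    """MFCC-style pipeline with the mel filterbank swapped for Gammatone filters."""
    # 1. Slice the signal into overlapping Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    # 3. Filterbank energies (this is the step where MFCC would use mel filters).
    centres = erb_space(100.0, fs / 2 - 200.0, n_filters)
    fb = gammatone_power_responses(fs, n_fft, centres)
    energies = power @ fb.T
    # 4. Log compression followed by a DCT gives the cepstral coefficients.
    return dct2(np.log(energies + 1e-10), n_ceps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = gcc_like(rng.standard_normal(8000))  # one second of noise at 8 kHz
    print(feats.shape)
```

Only step 3 differs from a textbook MFCC front end; a GWCC-style variant would substitute the responses of a Gammatone wavelet filterbank for `fb`, leaving the framing, log and DCT stages unchanged.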



