A propagation approach to modelling the joint distributions of clean and corrupted speech in the Mel-Cepstral domain

2013 IEEE Workshop on Automatic Speech Recognition and Understanding

Pub Date: 2013-12-01

DOI: 10.1109/ASRU.2013.6707726

Ramón Fernández Astudillo

Citations: 2

Abstract

This paper presents a closed-form solution relating the joint distributions of corrupted and clean speech in the short-time Fourier transform (STFT) and Mel-frequency cepstral coefficient (MFCC) domains. This enables a tighter integration of STFT-domain speech enhancement with feature- and model-compensation techniques for robust automatic speech recognition. The approach directly uses the conventional speech distortion model for STFT speech enhancement, allowing low-cost, single-pass, causal implementations. Unlike similar uncertainty propagation approaches, it provides the full joint distribution rather than only the posterior distribution, which opens additional model-compensation possibilities. The method is exemplified by deriving an MMSE-MFCC estimator from the propagated joint distribution. On AURORA4, the approach is shown to perform similarly to STFT uncertainty propagation (STFT-UP) while also yielding the full joint distribution.
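To make the idea of propagating an STFT-domain distribution into the MFCC domain concrete, the sketch below uses plain Monte Carlo sampling. It is not the paper's closed-form solution, which the abstract does not spell out. It assumes the distortion model is the common additive one, Y = X + D, with independent zero-mean circular complex Gaussian priors per frequency bin, so the posterior of the clean coefficient is Gaussian around the Wiener estimate. All function names, the filterbank layout, and the parameter values are illustrative assumptions.

```python
import numpy as np


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular Mel filterbank of shape (n_filters, n_bins) (illustrative layout)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fb = np.zeros((n_filters, n_bins))
    for j in range(n_filters):
        lo, mid, hi = edges[j], edges[j + 1], edges[j + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[j] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def dct_matrix(n_ceps, n_filters):
    """Unnormalised DCT-II matrix of shape (n_ceps, n_filters)."""
    i = np.arange(n_ceps)[:, None]
    j = np.arange(n_filters)[None, :]
    return np.cos(np.pi * i * (j + 0.5) / n_filters)


def mfcc(stft, fb, dct):
    """MFCCs of one frame (n_bins,) or a batch of frames (n_samples, n_bins)."""
    power = np.abs(stft) ** 2
    log_mel = np.log(power @ fb.T + 1e-10)  # small floor avoids log(0)
    return log_mel @ dct.T


def wiener_posterior(y, lam_x, lam_d):
    """Posterior mean and variance of X given Y = X + D (assumed Gaussian model)."""
    gain = lam_x / (lam_x + lam_d)
    return gain * y, gain * lam_d


def mc_mmse_mfcc(y, lam_x, lam_d, fb, dct, n_samples=2000, rng=None):
    """Monte Carlo estimate of the MFCC-domain posterior mean and variance."""
    rng = np.random.default_rng() if rng is None else rng
    mean, var = wiener_posterior(y, lam_x, lam_d)
    shape = (n_samples, y.shape[0])
    # Circular complex Gaussian noise with per-bin variance `var`.
    noise = np.sqrt(var / 2.0) * (rng.standard_normal(shape)
                                  + 1j * rng.standard_normal(shape))
    feats = mfcc(mean + noise, fb, dct)
    return feats.mean(axis=0), feats.var(axis=0)
```

Under these assumptions, the sample mean approximates an MMSE estimate of the MFCCs, and it generally differs from simply computing MFCCs of the Wiener-filtered frame because the logarithm is nonlinear. A closed-form propagation avoids the sampling cost, which is what makes the low-cost, single-pass implementations mentioned in the abstract plausible.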

