Performance estimation of noisy speech recognition using spectral distortion and SNR of noise-reduced speech

2013 IEEE International Conference of IEEE Region 10 (TENCON 2013). Pub Date: 2013-10-01. DOI: 10.1109/TENCON.2013.6718993

Guo Ling, Takeshi Yamada, S. Makino, N. Kitawaki


Citations: 6

Abstract

To ensure a satisfactory QoE (Quality of Experience) and to facilitate system design in speech recognition services, it is essential to establish a method for efficiently investigating recognition performance in different noise environments. Previously, we proposed a performance estimation method that uses PESQ (Perceptual Evaluation of Speech Quality) as a spectral distortion measure. However, the relationship between recognition performance and the distortion value differs depending on the noise reduction algorithm used. To solve this problem, we propose a novel performance estimation method whose estimator is defined as a function of both the distortion value and the SNR (Signal-to-Noise Ratio) of the noise-reduced speech. The estimator is applicable to different noise reduction algorithms without any modification. We confirmed the effectiveness of the proposed method in experiments using the AURORA-2J connected digit recognition task and four different noise reduction algorithms.
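The abstract does not give the functional form of the estimator, so the following is only a minimal sketch of the general idea: predict recognition accuracy from a distortion value and an SNR. It assumes a plain linear form fitted by least squares, and it assumes the SNR is measured against a clean reference signal; neither assumption comes from the paper. The function names `snr_db`, `fit_estimator`, and `estimate_accuracy` are hypothetical, and the distortion values (for example PESQ scores) are expected to come from an external tool.

```python
import numpy as np


def snr_db(clean, processed):
    """SNR in dB, treating (processed - clean) as the residual noise.

    Assumption: a clean reference is available; the paper may define
    the SNR of noise-reduced speech differently.
    """
    clean = np.asarray(clean, dtype=float)
    processed = np.asarray(processed, dtype=float)
    noise_power = np.sum((processed - clean) ** 2)
    if noise_power == 0.0:
        return float("inf")
    return 10.0 * np.log10(np.sum(clean ** 2) / noise_power)


def fit_estimator(distortion, snr, accuracy):
    """Least-squares fit of accuracy ~ a*distortion + b*snr + c.

    Assumption: the linear form is illustrative only, not the
    estimator proposed in the paper.
    """
    distortion = np.asarray(distortion, dtype=float)
    snr = np.asarray(snr, dtype=float)
    design = np.column_stack([distortion, snr, np.ones(len(distortion))])
    coeffs, *_ = np.linalg.lstsq(
        design, np.asarray(accuracy, dtype=float), rcond=None
    )
    return coeffs


def estimate_accuracy(coeffs, distortion, snr):
    """Apply the fitted coefficients to new (distortion, SNR) pairs."""
    a, b, c = coeffs
    return (
        a * np.asarray(distortion, dtype=float)
        + b * np.asarray(snr, dtype=float)
        + c
    )
```

In use, one would fit the coefficients once on measured (distortion, SNR, accuracy) triples pooled across noise reduction algorithms, then call `estimate_accuracy` on new conditions without refitting per algorithm, which mirrors the "without any modification" property the abstract claims.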
