Towards a Non-Intrusive Context-Aware Speech Quality Model
Authors: R. Jaiswal, Andrew Hines
Venue: 2020 31st Irish Signals and Systems Conference (ISSC)
Published: 2020-06-01
DOI: 10.1109/ISSC49989.2020.9180171 (https://doi.org/10.1109/ISSC49989.2020.9180171)
Citations: 3
Abstract
Understanding how humans judge perceived speech quality while interacting in real time through Voice over Internet Protocol (VoIP) applications is essential for building a robust and accurate speech quality prediction model. Background noise degrades speech quality, reducing the Quality of Experience (QoE). Speech Enhancement (SE) algorithms can improve speech quality in noisy environments. The publicly available NOIZEUS speech corpus contains speech in environmental background noise (babble, car, street, and train) at two signal-to-noise ratios (SNRs), 5 dB and 10 dB. Objective Speech Quality Metrics (OSQMs) are used to monitor and measure speech quality for VoIP applications. This paper proposes a context-aware QoE prediction model, CAQoE, which classifies the speech signal context (i.e., noise type and SNR) to allow context-specific speech quality prediction. It presents experiments conducted to develop the speech context-classification component of the proposed CAQoE model. SE algorithms are used in conjunction with an OSQM to estimate the Mean Opinion Score (MOS) of noisy and enhanced samples, and these estimates are used to train Machine Learning (ML) classifiers to classify the speech signal context. Results demonstrate that a Decision Tree (DT) classifier has better classification accuracy for the noise classes tested. We also present the associated components of the CAQoE model, namely Voice Activity Detection (VAD) and the Speech Quality Model (SQM).
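The context-classification idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature layout, the SE algorithms, or the OSQM, so the example assumes a hypothetical three-value feature vector (estimated MOS of the noisy sample and of two enhanced versions) and uses synthetic, made-up numbers in place of real measurements. The small decision tree is written with the Python standard library only; in practice a library implementation such as scikit-learn's `DecisionTreeClassifier` would be the more usual choice.

```python
import random
from collections import Counter

# Context labels (noise type and SNR) taken from the abstract.
CONTEXTS = [f"{noise}_{snr}dB"
            for noise in ("babble", "car", "street", "train")
            for snr in (5, 10)]

def make_synthetic_samples(n_per_class, seed=0):
    """Toy features: [MOS noisy, MOS after SE algorithm A, MOS after SE algorithm B].
    ASSUMPTION: all values are invented for illustration, not measured data."""
    rng = random.Random(seed)
    samples = []
    for idx, label in enumerate(CONTEXTS):
        # Arbitrary, distinct class centres inside the 1..5 MOS range.
        centre = [1.5 + 0.4 * idx,
                  2.0 + 0.3 * ((idx * 3) % 8),
                  1.8 + 0.35 * ((idx * 5) % 8)]
        for _ in range(n_per_class):
            features = [min(5.0, max(1.0, c + rng.gauss(0, 0.02))) for c in centre]
            samples.append((features, label))
    return samples

def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(samples):
    """Return (weighted impurity, feature index, threshold), or None if no split exists."""
    best = None
    n = len(samples)
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for x, y in samples if x[f] <= thr]
            right = [y for x, y in samples if x[f] > thr]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best

def build_tree(samples, depth=0, max_depth=8):
    """Grow a tree recursively; leaves are label strings, inner nodes are tuples."""
    labels = [y for _, y in samples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    split = best_split(samples)
    if split is None:
        return majority
    _, f, thr = split
    left = [s for s in samples if s[0][f] <= thr]
    right = [s for s in samples if s[0][f] > thr]
    return (f, thr,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def predict(tree, features):
    """Walk from the root to a leaf and return its context label."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if features[f] <= thr else right
    return tree

train = make_synthetic_samples(20, seed=0)
test = make_synthetic_samples(5, seed=1)
tree = build_tree(train)
accuracy = sum(predict(tree, x) == y for x, y in test) / len(test)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

Because the synthetic classes are deliberately well separated, the printed accuracy says nothing about performance on real speech; the sketch only shows how MOS-based feature vectors could be mapped to a noise-type and SNR label.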