Voice Activity Detection Algorithm with Low Signal-to-Noise Ratios Based on Spectrum Entropy
Authors: Kun-Ching Wang, Y. Tsai
Published in: 2008 Second International Symposium on Universal Communication
Publication date: 2008-12-15
DOI: 10.1109/ISUC.2008.55 (https://doi.org/10.1109/ISUC.2008.55)
Citations: 40
Abstract
This letter presents a robust voice activity detection (VAD) algorithm for noisy environments. The algorithm uses an entropy measure defined in the band-splitting spectrum domain, exploiting the formant frequency representation as a highly efficient, compact description of the time-varying characteristics of speech. Additionally, the Teager energy operator (TEO) can be applied prior to the entropy-based measurement to provide a better representation of formant information, which improves speech/non-speech classification. The results show that the proposed algorithm performs better overall than the standard ITU-T G.729B VAD and Shen's entropy-based VAD.
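The two ingredients named in the abstract, the Teager energy operator and an entropy measure over spectral bands, can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: the textbook discrete TEO, a naive DFT, uniform band splitting, the band count, and the fixed decision threshold. The function names (`teager_energy`, `band_spectral_entropy`, `teo_band_entropy`, `is_speech_like`) are hypothetical and do not come from the paper.

```python
import math


def teager_energy(x):
    """Textbook discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (O(N^2), adequate for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append(re * re + im * im)
    return spec


def band_spectral_entropy(frame, num_bands=8):
    """Entropy (in nats) of the normalized band-energy distribution of one frame.

    Assumption: uniform bands over the non-DC bins; the paper's band layout
    may differ.
    """
    spec = power_spectrum(frame)[1:]  # drop the DC bin
    bands = [0.0] * num_bands
    for k, p in enumerate(spec):
        bands[k * num_bands // len(spec)] += p  # assign each bin to a band
    total = sum(bands)
    if total <= 0.0:
        return math.log(num_bands)  # treat an empty spectrum as maximally flat
    probs = [e / total for e in bands]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def teo_band_entropy(frame, num_bands=8):
    """Apply the TEO first, then measure band entropy (ordering is an assumption)."""
    return band_spectral_entropy(teager_energy(frame), num_bands)


def is_speech_like(frame, threshold, num_bands=8):
    """Hypothetical fixed-threshold decision: low entropy means a peaked spectrum."""
    return teo_band_entropy(frame, num_bands) < threshold
```

With this definition, a frame whose energy is spread evenly across bands reaches the maximum entropy of `log(num_bands)`, while a frame dominated by a few spectral peaks scores much lower, which is the property an entropy-based detector relies on. A practical detector would likely replace the fixed `threshold` with one adapted to the estimated noise level.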