T. Arakawa, Haitham Al-Hassanieh, M. Tsujikawa, R. Isotani
Extended Minimum Classification Error Training in Voice Activity Detection
2009 IEEE Workshop on Automatic Speech Recognition & Understanding, published 2009-12-01
DOI: 10.1109/ASRU.2009.5373251 (https://doi.org/10.1109/ASRU.2009.5373251)
Citations: 2
Abstract
Voice Activity Detection (VAD) is a fundamental part of speech processing. Combining multiple acoustic features is an effective way to make VAD more robust against various noise conditions. Several feature combination methods have been proposed in which the weights for feature values are optimized by Minimum Classification Error (MCE) training. We improve these MCE-based methods by introducing a novel discriminative function for whole frames. The proposed method optimizes the combination weights while taking into account the ratio between false acceptance and false rejection rates, as well as the effect of shaping procedures such as hangover.
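
The abstract does not give the discriminative function or the training procedure, so the following is only a minimal sketch of generic frame-level MCE training for feature combination, not the extended method proposed in the paper. It assumes a linear combination of per-frame feature values, a sigmoid-smoothed error count minimized by gradient descent, and separate costs for false rejection and false acceptance as one simple way to bias their ratio. All names (train_mce, fr_cost, fa_cost, apply_hangover) and the toy data are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def score(weights, bias, features):
    """Weighted sum of feature values minus a threshold; > 0 means speech."""
    return sum(w * f for w, f in zip(weights, features)) - bias


def train_mce(frames, labels, n_iter=500, lr=0.5, alpha=1.0,
              fr_cost=1.0, fa_cost=1.0):
    """Gradient descent on a sigmoid-smoothed classification error.

    labels: 1 for speech frames, 0 for non-speech frames.
    fr_cost / fa_cost (assumed, not from the paper) weight the loss of
    speech and non-speech frames to trade false rejection for false acceptance.
    """
    dim = len(frames[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(frames)
    for _ in range(n_iter):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(frames, labels):
            sign = 1.0 if y == 1 else -1.0
            # Misclassification measure: positive when the frame is wrong.
            d = -sign * score(weights, bias, x)
            loss = sigmoid(alpha * d)
            cost = fr_cost if y == 1 else fa_cost
            g = cost * alpha * loss * (1.0 - loss)  # derivative of loss w.r.t. d
            for i in range(dim):
                grad_w[i] += g * (-sign) * x[i]     # dd/dw_i = -sign * x_i
            grad_b += g * sign                      # dd/dbias = +sign
        weights = [w - lr * gw / n for w, gw in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def apply_hangover(decisions, hangover=2):
    """Keep the speech decision active for `hangover` frames after speech ends."""
    out = []
    remaining = 0
    for d in decisions:
        if d:
            remaining = hangover
            out.append(1)
        elif remaining > 0:
            remaining -= 1
            out.append(1)
        else:
            out.append(0)
    return out


# Toy data (made up): two feature values per frame, higher for speech.
frames = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9],
          [0.1, 0.2], [0.2, 0.1], [0.3, 0.2]]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = train_mce(frames, labels)
decisions = [1 if score(weights, bias, x) > 0 else 0 for x in frames]
smoothed = apply_hangover(decisions, hangover=2)
```

In this sketch the hangover is applied only after training, so the weights are optimized on raw frame decisions. The abstract indicates that the proposed method instead accounts for the effect of such shaping procedures during weight optimization, which this example does not attempt to reproduce.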