A Modified Multi-Feature Voiced/Unvoiced Speech Classification Method
R. Cai
2010 Asia-Pacific Conference on Power Electronics and Design, 2010-05-30
DOI: 10.1109/APPED.2010.25 (https://doi.org/10.1109/APPED.2010.25)
Citations: 4
Abstract
A modified multi-feature voiced/unvoiced speech classification method is presented. The method is based on statistical analysis of three features computed over short-time segments of the speech signal: the wavelet-based frequency distribution of the average energy, the zero-crossing rate, and the average energy. It first classifies the input speech into voiced, unvoiced, and uncertain parts by comparing these features with predetermined thresholds. The uncertain parts are then handled under three conditions, and the boundary between voiced and unvoiced speech is determined from the average energy feature. The method was evaluated on a large speech database and is shown to perform well on both clean and noise-degraded speech.
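To make the two-stage idea concrete, the following is a minimal Python sketch of a threshold-based voiced/unvoiced/uncertain decision, not the paper's actual algorithm. It uses only two of the three features (short-time average energy and zero-crossing rate) and omits the wavelet-based feature entirely. All function names, the threshold values, and the frame length are hypothetical, and the second pass is a plain energy threshold standing in for the paper's three conditions, which the abstract does not specify.

```python
import math


def frame_features(samples, frame_len):
    """Return (average energy, zero-crossing rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between consecutive samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def classify_frame(energy, zcr, e_hi, e_lo, z_hi, z_lo):
    """First pass: compare features with fixed (hypothetical) thresholds."""
    if energy >= e_hi and zcr <= z_lo:
        return "voiced"
    if energy <= e_lo and zcr >= z_hi:
        return "unvoiced"
    return "uncertain"


def resolve_uncertain(labels, energies, e_mid):
    """Second pass (assumed stand-in): settle uncertain frames by energy alone."""
    return [
        ("voiced" if e >= e_mid else "unvoiced") if lab == "uncertain" else lab
        for lab, e in zip(labels, energies)
    ]


# Synthetic test signal: a strong low-frequency tone, a weak rapidly
# alternating signal, and a strong rapidly alternating signal.
frame_len = 160
tone = [math.sin(2 * math.pi * i / 80) for i in range(frame_len)]
weak_noise = [0.05 * (-1) ** i for i in range(frame_len)]
strong_noise = [0.5 * (-1) ** i for i in range(frame_len)]

feats = frame_features(tone + weak_noise + strong_noise, frame_len)
labels = [
    classify_frame(e, z, e_hi=0.1, e_lo=0.01, z_hi=0.5, z_lo=0.1)
    for e, z in feats
]
resolved = resolve_uncertain(labels, [e for e, _ in feats], e_mid=0.05)
print(labels)
print(resolved)
```

In this sketch the third frame has high energy but also a high zero-crossing rate, so the first pass leaves it uncertain and the energy-only second pass decides it. In practice the thresholds would be derived from statistics of labeled speech rather than fixed by hand.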