An overview of recent window based feature extraction algorithms for speaker recognition
Genevieve M. Sapijaszko, W. Mikhael. 2012 IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS). Published 2012-09-05. DOI: 10.1109/MWSCAS.2012.6292161
An important first step in speaker recognition is the extraction of unique and reliable features that can identify speakers from speech signals. Feature extraction methods have evolved over the last 20 years, with window-frame algorithms in particular showing promise. This paper experimentally compares recent window-frame algorithms using the Center for Spoken Language Understanding (CSLU) database. The coefficients compared are Real Cepstral Coefficients (RCC), Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Cepstral Coefficients (LPCC), and Perceptual Linear Predictive Cepstral Coefficients (PLPCC). Each feature extraction method is used in conjunction with a Vector Quantization (VQ) method and a Euclidean distance classifier to find which method yields the best recognition rate. Published state-of-the-art, window-based feature extraction methods are also surveyed and evaluated against published results.
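
The pipeline named in the abstract (windowed frames, cepstral features, a VQ codebook per speaker, and a Euclidean distance decision) can be illustrated with a minimal sketch. This is not the authors' implementation. It covers only real cepstral coefficients, uses plain k-means as a stand-in for whatever codebook training the paper uses, and replaces CSLU speech with synthetic two-tone signals. The frame size, hop, number of coefficients, codebook size, and all function names are assumptions made for illustration.

```python
import cmath
import math
import random


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frames(signal, size, hop):
    """Split a signal into overlapping, Hamming-windowed frames."""
    w = hamming(size)
    return [[signal[s + i] * w[i] for i in range(size)]
            for s in range(0, len(signal) - size + 1, hop)]


def real_cepstrum(frame, n_coeffs):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    A naive O(n^2) DFT keeps the sketch dependency-free. Coefficient 0
    (overall level) is skipped; that is a choice made here, not the paper's.
    """
    n = len(frame)
    log_mag = []
    for k in range(n):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(abs(acc) + 1e-10))
    # The log magnitude of a real signal is symmetric, so the inverse DFT is a cosine sum.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
            for q in range(1, n_coeffs + 1)]


def features(signal, size=64, hop=32, n_coeffs=8):
    """One cepstral feature vector per frame (parameter values are assumptions)."""
    return [real_cepstrum(f, n_coeffs) for f in frames(signal, size, hop)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, k, iters=10, seed=0):
    """VQ codebook via plain k-means (a stand-in for the paper's VQ training)."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, codebook[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def distortion(vectors, codebook):
    """Average Euclidean distance from each vector to its nearest codeword."""
    return sum(math.sqrt(min(sq_dist(v, c) for c in codebook))
               for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Pick the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: distortion(vectors, codebooks[name]))


def synth(freqs, n, seed, noise=0.05):
    """Synthetic 'speaker': a sum of tones plus Gaussian noise (hypothetical data)."""
    rng = random.Random(seed)
    return [sum(math.sin(2 * math.pi * f * t) for f in freqs) + rng.gauss(0, noise)
            for t in range(n)]


# Two hypothetical speakers, distinguished only by their tone frequencies.
speakers = {"A": [0.05, 0.12], "B": [0.20, 0.33]}
codebooks = {name: train_codebook(features(synth(f, 512, seed=1)), k=4)
             for name, f in speakers.items()}
guess_a = identify(features(synth(speakers["A"], 512, seed=2)), codebooks)
guess_b = identify(features(synth(speakers["B"], 512, seed=2)), codebooks)
print(guess_a, guess_b)
```

Swapping `real_cepstrum` for an MFCC, LPCC, or PLPCC extractor while keeping `train_codebook` and `identify` fixed is one way to set up the kind of side-by-side comparison the abstract describes.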