Speaker identification using hidden Markov models
M. Inman, D. Danforth, S. Hangai, K. Sato
ICSP '98. 1998 Fourth International Conference on Signal Processing (Cat. No.98TH8344), 1998-10-12
DOI: 10.1109/ICOSP.1998.770285 (https://doi.org/10.1109/ICOSP.1998.770285)
Citations: 10
Abstract
In this study, we show that hidden Markov models (HMMs) significantly improve the success rate of speaker identification over time. The segment boundary information derived from the HMMs provides a means of normalizing the formant patterns obtained from a digital cochlear filter, which we also describe. Our use of the digital cochlear filter and HMMs was motivated by two well-known problems in speech recognition in general: variability in phonetic tempo, and variability in speech over longer time spans, typically days. We show how these problems can be minimized to achieve more robust speaker identification.
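The abstract does not say how the segment boundaries are used to normalize the formant patterns. As a rough illustration of the general idea only, the sketch below assumes that each frame already carries an HMM state label (for example from a Viterbi alignment) and that normalization means linearly resampling each segment of a formant track to a fixed number of frames. The function names, the linear resampling, and the formant values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def segment_boundaries(states: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, for each run of equal states."""
    if not states:
        return []
    bounds = []
    start = 0
    for i in range(1, len(states)):
        if states[i] != states[i - 1]:
            bounds.append((start, i))
            start = i
    bounds.append((start, len(states)))
    return bounds


def resample(track: Sequence[float], n: int) -> List[float]:
    """Linearly interpolate a non-empty track onto n evenly spaced points."""
    if len(track) == 1 or n == 1:
        return [float(track[0])] * n
    out = []
    for k in range(n):
        pos = k * (len(track) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(track) - 1)
        frac = pos - lo
        out.append(track[lo] * (1 - frac) + track[hi] * frac)
    return out


def normalize_tempo(track: Sequence[float], states: Sequence[int],
                    frames_per_segment: int = 3) -> List[float]:
    """Stretch or squeeze every state segment to the same length (assumed scheme)."""
    out: List[float] = []
    for start, end in segment_boundaries(states):
        out.extend(resample(track[start:end], frames_per_segment))
    return out


# Hypothetical first-formant tracks (Hz) for the same sound pattern spoken
# quickly and slowly, with assumed per-frame HMM state labels.
fast_track = [500.0, 700.0, 1500.0, 1500.0]
fast_states = [0, 0, 1, 1]
slow_track = [500.0, 600.0, 700.0, 1500.0, 1500.0, 1500.0, 1500.0]
slow_states = [0, 0, 0, 1, 1, 1, 1]

fast_norm = normalize_tempo(fast_track, fast_states)
slow_norm = normalize_tempo(slow_track, slow_states)
print(fast_norm == slow_norm)
```

In this toy case the two utterances differ only in tempo, so their normalized tracks coincide once every segment has the same length. Real formant tracks would differ in more than timing, and the paper's actual normalization may work differently.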