Mel Spectrogram Based Automatic Speaker Verification Using GMM-UBM
T. Kumar, Ramesh Kumar Bhukya
2022 IEEE 9th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON), published 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986424 (https://doi.org/10.1109/UPCON56432.2022.9986424)
Abstract
Speaker recognition refers to the technology that enables machines to recognize persons from their speech utterances. Automatic speaker verification (ASV) is one of the challenging tasks in the speech community. An ASV system works by checking a speaker's claimed identity against the model of the claimed speaker. In this paper, the system operates as a text-independent speaker verification (TISV) system and is designed to verify the speaker using his/her voice samples. We followed two approaches. In the first, a Gaussian Mixture Model (GMM) is used for speaker modeling. In the second, GMMs are created from the training dataset, with a Universal Background Model (UBM) used for adaptation to the dataset, a well-known approach for speaker verification (SV). The GMM-UBM also serves as the classifier for decision making. In the two approaches, training is performed with Expectation Maximization (EM) and Maximum A Posteriori (MAP) adaptation, respectively, to obtain better models. The system is evaluated on the NIST 2003 database with the adapted GMM-UBM, following the NIST 2003 speaker recognition evaluation protocol, and the relative performance improvements of the SV system using GMM and GMM-UBM, in terms of equal error rate (EER), are 9.43% and 8.88%, respectively.
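The GMM-UBM pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes frame-level feature vectors are already available (synthetic random data stands in for mel-spectrogram frames here), adapts only the component means, and uses a relevance factor of 16, which is a conventional value and not one taken from the paper. The helper names `train_ubm`, `map_adapt_means`, `diag_gmm_avg_loglik`, `llr_score` and `equal_error_rate` are hypothetical.

```python
import numpy as np
from scipy.special import logsumexp
from sklearn.mixture import GaussianMixture


def train_ubm(features, n_components=4, seed=0):
    """Fit a diagonal-covariance UBM with EM on pooled background frames."""
    ubm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          max_iter=200, random_state=seed)
    ubm.fit(features)
    return ubm


def map_adapt_means(ubm, features, relevance=16.0):
    """MAP-adapt the UBM means to one speaker (weights and variances are kept)."""
    post = ubm.predict_proba(features)               # (T, C) responsibilities
    n_c = post.sum(axis=0)                           # soft frame count per component
    e_c = (post.T @ features) / np.maximum(n_c, 1e-10)[:, None]  # (C, D) data means
    alpha = (n_c / (n_c + relevance))[:, None]       # data-dependent mixing weight
    return alpha * e_c + (1.0 - alpha) * ubm.means_


def diag_gmm_avg_loglik(features, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    diff = features[:, None, :] - means[None, :, :]                  # (T, C, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)    # (C,)
    log_exp = -0.5 * ((diff ** 2) / variances[None, :, :]).sum(axis=2)  # (T, C)
    log_comp = np.log(weights) + log_norm + log_exp
    return float(logsumexp(log_comp, axis=1).mean())


def llr_score(ubm, speaker_means, features):
    """Log-likelihood ratio: adapted speaker model versus the UBM."""
    target = diag_gmm_avg_loglik(features, ubm.weights_, speaker_means,
                                 ubm.covariances_)
    background = diag_gmm_avg_loglik(features, ubm.weights_, ubm.means_,
                                     ubm.covariances_)
    return target - background


def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER: the point where false accepts and false rejects meet."""
    thresholds = np.sort(np.concatenate([target_scores, impostor_scores]))
    far = np.array([(impostor_scores >= t).mean() for t in thresholds])
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return float(0.5 * (far[i] + frr[i]))


# Synthetic stand-in data (assumption): 4-dimensional "frames".
rng = np.random.default_rng(0)
background = np.vstack([rng.normal(c, 1.0, size=(600, 4))
                        for c in (-2.0, 0.0, 2.0)])
enroll = rng.normal(1.5, 1.0, size=(300, 4))          # enrollment speech
target_trial = rng.normal(1.5, 1.0, size=(200, 4))    # same speaker
impostor_trial = rng.normal(-1.5, 1.0, size=(200, 4)) # different speaker

ubm = train_ubm(background)
speaker_means = map_adapt_means(ubm, enroll)
target_llr = llr_score(ubm, speaker_means, target_trial)
impostor_llr = llr_score(ubm, speaker_means, impostor_trial)
print("target LLR:", target_llr, "impostor LLR:", impostor_llr)
```

A verification decision would compare the log-likelihood ratio with a threshold. Over many target and impostor trials, `equal_error_rate` gives the operating point at which the two error rates coincide, which is the metric reported in the abstract.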