Comparison of I-vector and GMM-UBM approaches to speaker identification with TIMIT and NIST 2008 databases in challenging environments

Musab T. S. Al-Kaltakchi, W. L. Woo, S. Dlay, J. Chambers
{"title":"Comparison of I-vector and GMM-UBM approaches to speaker identification with TIMIT and NIST 2008 databases in challenging environments","authors":"Musab T. S. Al-Kaltakchi, W. L. Woo, S. Dlay, J. Chambers","doi":"10.23919/eusipco.2017.8081264","DOIUrl":null,"url":null,"abstract":"In this paper, two models, the I-vector and the Gaussian Mixture Model-Universal Background Model (GMM-UBM), are compared for the speaker identification task. Four feature combinations of I-vectors with seven fusion techniques are considered: maximum, mean, weighted sum, cumulative, interleaving and concatenated for both two and four features. In addition, an Extreme Learning Machine (ELM) is exploited to identify speakers, and then Speaker Identification Accuracy (SIA) is calculated. Both systems are evaluated for 120 speakers from the TIMIT and NIST 2008 databases for clean speech. Furthermore, a comprehensive evaluation is made under Additive White Gaussian Noise (AWGN) conditions and with three types of Non Stationary Noise (NSN), both with and without handset effects for the TIMIT database. The results show that the I-vector approach is better than the GMM-UBM for both clean and AWGN conditions without a handset. However, the GMM-UBM had better accuracy for NSN types.","PeriodicalId":346811,"journal":{"name":"2017 25th European Signal Processing Conference (EUSIPCO)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 25th European Signal Processing Conference (EUSIPCO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/eusipco.2017.8081264","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

Abstract

In this paper, two models, the I-vector and the Gaussian Mixture Model-Universal Background Model (GMM-UBM), are compared for the speaker identification task. Four feature combinations of I-vectors with seven fusion techniques are considered: maximum, mean, weighted sum, cumulative, interleaving and concatenated for both two and four features. In addition, an Extreme Learning Machine (ELM) is exploited to identify speakers, and then Speaker Identification Accuracy (SIA) is calculated. Both systems are evaluated for 120 speakers from the TIMIT and NIST 2008 databases for clean speech. Furthermore, a comprehensive evaluation is made under Additive White Gaussian Noise (AWGN) conditions and with three types of Non Stationary Noise (NSN), both with and without handset effects for the TIMIT database. The results show that the I-vector approach is better than the GMM-UBM for both clean and AWGN conditions without a handset. However, the GMM-UBM had better accuracy for NSN types.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
I-vector和GMM-UBM方法在挑战性环境下与TIMIT和NIST 2008数据库的说话人识别比较
本文将i向量模型和高斯混合模型-通用背景模型(GMM-UBM)两种模型用于说话人识别任务的比较。考虑了i向量的四种特征组合和七种融合技术:最大值、平均值、加权和、累加、交织和连接两个和四个特征。此外,利用极限学习机(ELM)识别说话人,并计算说话人识别精度(SIA)。这两个系统都对来自TIMIT和NIST 2008数据库的120名说话者进行了干净语音评估。此外,在加性高斯白噪声(AWGN)条件下,对TIMIT数据库进行了三种类型的非平稳噪声(NSN)的综合评价,包括有和没有手机效应。结果表明,在清洁和无手机的AWGN条件下,i向量方法都优于GMM-UBM方法。然而,GMM-UBM对NSN类型具有更好的准确性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Image deblurring using a perturbation-basec regularization approach Distributed computational load balancing for real-time applications Nonconvulsive epileptic seizures detection using multiway data analysis Performance improvement for wideband beamforming with white noise reduction based on sparse arrays Wideband DoA estimation based on joint optimisation of array and spatial sparsity
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1