Comparison of text-independent speaker recognition methods on telephone speech with acoustic mismatch

S. Vuuren
Published in: Proceedings : ICSLP. International Conference on Spoken Language Processing, volume 25 1, pages 1788-1791
Publication date: 1996-10-03
DOI: 10.21437/ICSLP.1996-454 (https://doi.org/10.21437/ICSLP.1996-454)
Citation count: 46

Abstract

We compare the speaker recognition performance of vector quantization (VQ), Gaussian mixture modeling (GMM) and the Arithmetic Harmonic Sphericity measure (AHS) in adverse telephone speech conditions. The aim is to address the question: how do multimodal VQ and GMM typically compare to the simpler unimodal AHS for matched and mismatched training and testing environments? We study identification (closed-set) and verification errors on a new multi-environment database. We consider LPC and PLP features as well as their RASTA derivatives. We conclude that RASTA processing can remove redundancies from the features. We affirm that even when we use channel and noise compensation schemes, speaker recognition errors remain high when there is acoustic mismatch.
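The abstract calls AHS the "simpler unimodal" method without giving its formula. As a rough illustration, the sketch below uses the form in which the measure is commonly stated: each utterance is summarized by a single covariance matrix of its feature frames, and the distance is the log ratio of the arithmetic to the harmonic mean of the eigenvalues of C_y C_x^{-1}, which equals log(tr(C_y C_x^{-1}) tr(C_x C_y^{-1}) / D^2) for D-dimensional features. This form is an assumption, as are the helper names `covariance`, `inverse`, `trace_of_product` and `ahs_distance` and the random stand-in "features"; none of it is taken from the paper's implementation.

```python
import math
import random


def covariance(frames):
    """Sample covariance matrix; rows are frames, columns are feature dimensions."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in frames:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j])
    return [[c / (n - 1) for c in row] for row in cov]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting; adequate for small matrices."""
    d = len(m)
    # Augment m with the identity matrix: [m | I].
    aug = [list(row) + [float(i == j) for j in range(d)] for i, row in enumerate(m)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[d:] for row in aug]


def trace_of_product(a, b):
    """tr(a @ b) without forming the full product."""
    d = len(a)
    return sum(a[i][k] * b[k][i] for i in range(d) for k in range(d))


def ahs_distance(cov_x, cov_y):
    """Assumed AHS form: log(tr(Cy Cx^-1) * tr(Cx Cy^-1) / D^2)."""
    d = len(cov_x)
    t1 = trace_of_product(cov_y, inverse(cov_x))
    t2 = trace_of_product(cov_x, inverse(cov_y))
    return math.log(t1 * t2 / d ** 2)


# Hypothetical usage: random 2-D frames stand in for real LPC/PLP features.
random.seed(0)
train = [[random.gauss(0, 1), random.gauss(0, 2)] for _ in range(200)]
same_shape = [[random.gauss(0, 1), random.gauss(0, 2)] for _ in range(200)]
other_shape = [[random.gauss(0, 2), random.gauss(0, 1)] for _ in range(200)]

c_train = covariance(train)
print("similar covariance shape:  ", ahs_distance(c_train, covariance(same_shape)))
print("different covariance shape:", ahs_distance(c_train, covariance(other_shape)))
```

In this form the measure is symmetric, non-negative up to rounding, and zero when the two covariance matrices are proportional. It needs only one covariance matrix per speaker, which is one way to read "unimodal" in contrast to the multi-centroid VQ codebooks and multi-component GMMs the paper compares it against.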