Concatenated phoneme models for text-variable speaker recognition

Tomoko Matsui, S. Furui
Published in: 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing
Publication date: 1993-04-27
DOI: 10.1109/ICASSP.1993.319321 (https://doi.org/10.1109/ICASSP.1993.319321)
Citation count: 134

Abstract

Methods that create models to specify both speaker and phonetic information accurately by using only a small amount of training data for each speaker are investigated. For a text-dependent speaker recognition method, in which arbitrary key texts are prompted by the recognizer, speaker-specific phoneme models are necessary to identify the key text and recognize the speaker. Two methods of making speaker-specific phoneme models are discussed: phoneme-adaptation of a phoneme-independent speaker model and speaker-adaptation of universal phoneme models. The authors also investigate supplementing these methods by adding a phoneme-independent speaker model to make up for the lack of speaker information. This combination achieves a rejection rate as high as 98.5% for speech that differs from the key text and a speaker verification rate of 100.0%.
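
The idea of scoring a prompted key text against concatenated speaker-specific phoneme models, and then adding a phoneme-independent speaker score, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the models are parameterized or how the two scores are combined. The sketch assumes one-state Gaussian phoneme models over scalar features, a left-to-right alignment, and a weighted sum of per-frame log-likelihoods. The names concat_score, combined_score, and weight are hypothetical.

```python
import math

NEG_INF = float("-inf")


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (scalar features, for brevity only)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def concat_score(frames, phonemes, phoneme_models):
    """Best log-likelihood of `frames` under a left-to-right concatenation
    of one-state phoneme models. Every phoneme must absorb at least one
    frame; returns -inf when no such alignment exists."""
    if not frames or not phonemes:
        return NEG_INF
    states = [phoneme_models[p] for p in phonemes]
    # best[j]: best score of any alignment ending in state j at this frame
    best = [NEG_INF] * len(states)
    best[0] = log_gauss(frames[0], *states[0])
    for x in frames[1:]:
        new = [NEG_INF] * len(states)
        for j, (mean, var) in enumerate(states):
            stay = best[j]
            advance = best[j - 1] if j > 0 else NEG_INF
            prev = max(stay, advance)
            if prev > NEG_INF:
                new[j] = prev + log_gauss(x, mean, var)
        best = new
    return best[-1]


def combined_score(frames, phonemes, phoneme_models, speaker_model, weight=0.5):
    """Assumed combination rule: weighted sum of the per-frame text-dependent
    score and the per-frame phoneme-independent speaker score."""
    text_part = concat_score(frames, phonemes, phoneme_models) / len(frames)
    spk_part = sum(log_gauss(x, *speaker_model) for x in frames) / len(frames)
    return (1 - weight) * text_part + weight * spk_part


# Toy data: phoneme "a" is centred at 0.0 and "b" at 5.0 for this speaker.
models = {"a": (0.0, 1.0), "b": (5.0, 1.0)}
frames = [0.1, -0.2, 5.1, 4.9]
matching = combined_score(frames, ["a", "b"], models, (2.5, 9.0))
mismatched = combined_score(frames, ["b", "a"], models, (2.5, 9.0))
```

In this toy setup the utterance scores higher against the prompted sequence "a", "b" than against the reversed sequence, which is the behaviour a recognizer would need in order to reject speech that differs from the key text. A real system would threshold such a score; the threshold and the weighting are not given in the abstract.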