Study of Relationships between Intra-speaker's Speech Variability and Speech Recognition Performance

S. Tsuge, Minoru Fukumi, M. Shishibori, Fuji Ren, K. Kita, S. Kuroiwa
DOI: 10.1109/ISPACS.2006.364831 (https://doi.org/10.1109/ISPACS.2006.364831)
Published in: 2006 International Symposium on Intelligent Signal Processing and Communications
Publication date: 2006-12-01
Citations: 0

Abstract

Speech recognition performance varies even when a speaker uses a speaker-dependent speech recognition system. This is because speech quality is affected by factors such as emotion and background noise, even when the speaker and the utterance remain constant. However, the relationships between intra-speaker speech variability and speech recognition performance are not clear. Hence, we focus on the intra-speaker speech variability that affects speech recognition performance. To investigate these relationships, we have been collecting speech data since November 2002. Using part of this speech corpus, we conducted speech recognition experiments. In this paper, we use correlation analysis to examine the relationships between intra-speaker speech variability and phoneme accuracy. As factors in the correlation analysis, we use the number of errors, the speaking rate, and the likelihood. The results show a strong correlation between the number of substitution errors and phoneme accuracy, whereas the correlations for the numbers of deletion and insertion errors are low. We therefore consider that phonemes overlap because the feature parameters vary with speaking rate. To improve phoneme accuracy, a method that better discriminates between phonemes needs to be studied. On the other hand, although the correlation between phoneme accuracy and speaking rate appears to be low, strong correlations are found between the speaking rate and the numbers of deletion and insertion errors. Because the number of insertion errors and the number of deletion errors counterbalanced each other, the correlation between the speaking rate and phoneme accuracy was low. Nevertheless, we consider it necessary to normalize the speaking rate, because it influences the numbers of deletion and insertion errors.
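The counterbalance effect described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the accuracy formula (N - S - D - I) / N is the commonly used definition and is not stated in the abstract, the per-session numbers are synthetic and are not results from the paper, and the names `pearson` and `phoneme_accuracy` are hypothetical helpers.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


def phoneme_accuracy(n_ref, subs, dels, ins):
    """Assumed definition: (N - S - D - I) / N."""
    return (n_ref - subs - dels - ins) / n_ref


# Synthetic sessions for one speaker (NOT data from the paper).
N_REF = 1000                                          # reference phonemes per session
speaking_rate = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]   # assumed unit: phonemes/s
deletions = [10, 15, 20, 25, 30, 35, 40]              # rise with speaking rate
insertions = [40, 35, 30, 25, 20, 15, 10]             # fall with speaking rate
substitutions = [52, 47, 55, 49, 51, 46, 54]          # unrelated to speaking rate

accuracy = [
    phoneme_accuracy(N_REF, s, d, i)
    for s, d, i in zip(substitutions, deletions, insertions)
]

r_rate_del = pearson(speaking_rate, deletions)
r_rate_ins = pearson(speaking_rate, insertions)
r_rate_acc = pearson(speaking_rate, accuracy)
r_sub_acc = pearson(substitutions, accuracy)

print(f"rate vs deletions:         {r_rate_del:+.2f}")
print(f"rate vs insertions:        {r_rate_ins:+.2f}")
print(f"rate vs accuracy:          {r_rate_acc:+.2f}")
print(f"substitutions vs accuracy: {r_sub_acc:+.2f}")
```

In this toy data, deletions and insertions always sum to the same value, so their opposite trends cancel in the accuracy figure. The speaking rate then correlates strongly with each error type but barely with accuracy, while substitution errors drive accuracy almost entirely. That is the qualitative pattern the abstract reports.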