Improved utterance rejection using length dependent thresholds

5th International Conference on Spoken Language Processing (ICSLP 1998); Pub Date: 1998-11-30; DOI: 10.21437/ICSLP.1998-425

Sunil K. Gupta, F. Soong


Citations: 7

Abstract

In this paper, we propose to use an utterance length (duration) dependent threshold for rejecting an unknown input utterance with a general speech (garbage) model. A general speech model, compared with more sophisticated anti-subword models, is a more viable solution to the utterance rejection problem for low-cost applications with stringent storage and computational constraints. However, the rejection performance of such a general model with a fixed, universal rejection threshold is in general worse than that of the more discriminative anti-models. Without adding complexity to the rejection algorithm, we propose to vary the rejection threshold according to the utterance length. The experimental results show that a significant improvement in rejection performance can be obtained by using the proposed length dependent rejection threshold instead of a fixed threshold. We investigate utterance rejection in a command phrase recognition task. The equal error rate, a good figure of merit for calibrating the performance of utterance verification algorithms, is reduced by almost 23% when the proposed length dependent threshold is used.
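The idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring function or the functional form of the threshold, so everything specific here is an assumption made for illustration: the score is taken to be a per-frame log-likelihood ratio between the recognized command model and the general speech (garbage) model, the length dependent threshold is assumed to have the form `tau_inf + slope / num_frames`, and the names `Utterance`, `fixed_threshold`, `length_dependent_threshold`, and `TOY_UTTS` are hypothetical. The toy values are synthetic and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    score: float     # assumed: per-frame log-likelihood ratio, command model vs. general speech model
    num_frames: int  # utterance length in frames
    in_vocab: bool   # ground-truth label, used only for evaluation


def fixed_threshold(num_frames: int, tau: float = 0.5) -> float:
    """Baseline: the same rejection threshold for every utterance length."""
    return tau


def length_dependent_threshold(num_frames: int, tau_inf: float = 0.3,
                               slope: float = 20.0) -> float:
    """Assumed form (not from the paper): stricter for short utterances,
    relaxing toward tau_inf as the utterance gets longer."""
    return tau_inf + slope / num_frames


def accept(utt: Utterance, threshold_fn) -> bool:
    """Accept the utterance if its score reaches the threshold for its length."""
    return utt.score >= threshold_fn(utt.num_frames)


def error_rates(utts, threshold_fn):
    """Return (false rejection rate, false acceptance rate)."""
    in_vocab = [u for u in utts if u.in_vocab]
    out_vocab = [u for u in utts if not u.in_vocab]
    false_rej = sum(not accept(u, threshold_fn) for u in in_vocab) / len(in_vocab)
    false_acc = sum(accept(u, threshold_fn) for u in out_vocab) / len(out_vocab)
    return false_rej, false_acc


# Synthetic toy data, chosen only to show the mechanics.
TOY_UTTS = [
    Utterance(score=0.90, num_frames=40, in_vocab=True),    # short valid command
    Utterance(score=0.70, num_frames=40, in_vocab=False),   # short out-of-vocabulary input
    Utterance(score=0.45, num_frames=200, in_vocab=True),   # long valid command
    Utterance(score=0.20, num_frames=200, in_vocab=False),  # long out-of-vocabulary input
]

if __name__ == "__main__":
    print("fixed threshold:           ", error_rates(TOY_UTTS, fixed_threshold))
    print("length dependent threshold:", error_rates(TOY_UTTS, length_dependent_threshold))
```

Because the threshold is just a function of the frame count, the decision rule costs no more than the fixed-threshold comparison, which matches the abstract's point about not adding complexity. The equal error rate mentioned in the abstract would be found by sweeping the threshold parameters until the two rates returned by `error_rates` coincide.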
