Efficient spoken term discovery using randomized algorithms

2011 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2011-12-01. DOI: 10.1109/ASRU.2011.6163965

A. Jansen, Benjamin Van Durme

Citations: 165

Abstract

Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. Initial solutions relied on exhaustive dynamic time warping-based searches across the entire similarity matrix, a method whose scalability is ultimately limited by the O(n^2) nature of the search space. Recent strategies have attempted to improve search efficiency by using either unsupervised or mismatched-language acoustic models to reduce the complexity of the feature representation. Taking a completely different approach, this paper investigates the use of randomized algorithms that operate directly on the raw acoustic features to produce sparse approximate similarity matrices in O(n) space and O(n log n) time. We demonstrate that these techniques facilitate spoken term discovery performance capable of outperforming a model-based strategy in the zero resource setting.
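
The abstract does not say which randomized algorithms are used, so the following is only a minimal sketch of one well-known way to get a sparse approximate similarity matrix within comparable bounds. It assumes random-hyperplane bit signatures (a form of locality-sensitive hashing for cosine similarity) followed by sorting the signatures under a few random bit permutations and comparing each frame only to its nearest neighbors in sorted order. The function names `bit_signatures` and `sparse_similarity`, and the parameters `n_bits`, `n_perms`, and `window`, are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def bit_signatures(frames: Sequence[Sequence[float]], n_bits: int,
                   rng: random.Random) -> List[int]:
    """Map each feature vector to an n_bits-wide integer signature.

    Bit k is 1 when the vector lies on the non-negative side of the
    k-th random Gaussian hyperplane (an assumed LSH scheme).
    """
    dim = len(frames[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    signatures = []
    for x in frames:
        sig = 0
        for plane in planes:
            dot = sum(a * b for a, b in zip(plane, x))
            sig = (sig << 1) | (1 if dot >= 0 else 0)
        signatures.append(sig)
    return signatures


def sparse_similarity(frames: Sequence[Sequence[float]], n_bits: int = 32,
                      n_perms: int = 4, window: int = 2,
                      seed: int = 0) -> Dict[Tuple[int, int], float]:
    """Return {(i, j): approximate cosine similarity} for a sparse set of pairs.

    Each permutation costs one O(n log n) sort, and each frame is compared
    to at most `window` successors in sorted order, so at most
    n_perms * n * window entries are stored.
    """
    rng = random.Random(seed)
    sigs = bit_signatures(frames, n_bits, rng)
    sims: Dict[Tuple[int, int], float] = {}
    for _ in range(n_perms):
        # Shuffle the bit positions so a different set of bits leads the sort.
        perm = list(range(n_bits))
        rng.shuffle(perm)
        permuted = [
            sum(((s >> src) & 1) << dst for dst, src in enumerate(perm))
            for s in sigs
        ]
        order = sorted(range(len(frames)), key=lambda i: permuted[i])
        for pos, i in enumerate(order):
            for j in order[pos + 1:pos + 1 + window]:
                key = (min(i, j), max(i, j))
                if key not in sims:
                    # The fraction of differing bits estimates angle / pi.
                    hamming = bin(sigs[i] ^ sigs[j]).count("1")
                    sims[key] = math.cos(math.pi * hamming / n_bits)
    return sims
```

With `n_perms` and `window` held constant, the stored entries grow linearly with the number of frames and the sorts dominate the running time, which is the kind of trade-off the abstract describes: an exhaustive O(n^2) comparison is replaced by an approximate, sparse one.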

