Towards robust methods for spoken document retrieval

5th International Conference on Spoken Language Processing (ICSLP 1998). Pub Date: 1998-11-30. DOI: 10.21437/ICSLP.1998-480

Kenney Ng

Citations: 43

Abstract

In this paper, we investigate a number of robust indexing and retrieval methods in an effort to improve spoken document retrieval performance in the presence of speech recognition errors. In particular, we examine expanding the original query representation to include confusible terms; developing a new document-query retrieval measure based on approximate matching that is less sensitive to recognition errors; expanding the document representation to include multiple recognition hypotheses; modifying the original query using automatic relevance feedback to include new terms found in the top ranked documents; and combining information from multiple subword unit representations. We study the different methods individually and then explore the effects of combining them. Experiments on radio broadcast news data show that using a combination of these methods can improve retrieval performance by over 20%.
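As a rough illustration of the first idea in the abstract (expanding the query with confusable terms over a subword representation), the following is a minimal sketch and not the paper's actual method. The phone n-gram indexing terms, the `confusions` table, its weights, and the inner-product score are all assumptions made for the example; the abstract does not specify any of them.

```python
from collections import Counter


def phone_ngrams(phones, n=3):
    """Return the overlapping n-grams of a phone sequence as tuples."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def expand_query(query_terms, confusions):
    """Add confusable terms to the query with reduced weights.

    `confusions` is a hypothetical table mapping a term to
    {alternative term: weight}, e.g. estimated from recognizer errors.
    """
    weights = Counter()
    for term in query_terms:
        weights[term] += 1.0  # original query term keeps full weight
        for alt, w in confusions.get(term, {}).items():
            weights[alt] += w  # confusable alternative gets a partial weight
    return weights


def score(query_weights, doc_terms):
    """Weighted inner product of query weights and document term counts."""
    doc_counts = Counter(doc_terms)
    return sum(w * doc_counts[t] for t, w in query_weights.items())


# Illustrative data: the recognizer is assumed to have heard "k eh t"
# where "k ae t" was spoken.
query = phone_ngrams(["k", "ae", "t"], n=2)
document = phone_ngrams(["k", "eh", "t"], n=2)
confusions = {("ae", "t"): {("eh", "t"): 0.5}}

plain = score(expand_query(query, {}), document)
expanded = score(expand_query(query, confusions), document)
```

In this toy case the unexpanded query shares no terms with the misrecognized document, while the expanded query matches through the confusable bigram. A real system would also need term weighting and length normalization, which are omitted here.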


