{"title":"Query modeling for spoken document retrieval","authors":"Berlin Chen, Pei-Ning Chen, Kuan-Yu Chen","doi":"10.1109/ASRU.2011.6163963","DOIUrl":null,"url":null,"abstract":"Spoken document retrieval (SDR) has recently become a more interesting research avenue due to increasing volumes of publicly available multimedia associated with speech information. Many efforts have been devoted to developing elaborate indexing and modeling techniques for representing spoken documents, but only few to improving query formulations for better representing the users' information needs. In view of this, we recently presented a language modeling framework exploring a novel use of relevance information cues for improving query effectiveness. Our work in this paper continues this general line of research in two main aspects. We further explore various ways to glean both relevance and non-relevance cues from the spoken document collection so as to enhance query modeling in an unsupervised fashion. Furthermore, we also investigate representing the query and documents with different granularities of index features to work in conjunction with the various relevance and/or non-relevance cues. 
Experiments conducted on the TDT (Topic Detection and Tracking) SDR task demonstrate the performance merits of the methods instantiated from our retrieval framework when compared to other existing retrieval methods.","PeriodicalId":338241,"journal":{"name":"2011 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"115 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Workshop on Automatic Speech Recognition & Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2011.6163963","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
Spoken document retrieval (SDR) has recently become a more interesting research avenue due to increasing volumes of publicly available multimedia associated with speech information. Many efforts have been devoted to developing elaborate indexing and modeling techniques for representing spoken documents, but only a few to improving query formulations for better representing the users' information needs. In view of this, we recently presented a language modeling framework exploring a novel use of relevance information cues for improving query effectiveness. Our work in this paper continues this general line of research in two main aspects. We further explore various ways to glean both relevance and non-relevance cues from the spoken document collection so as to enhance query modeling in an unsupervised fashion. Furthermore, we also investigate representing the query and documents with different granularities of index features to work in conjunction with the various relevance and/or non-relevance cues. Experiments conducted on the TDT (Topic Detection and Tracking) SDR task demonstrate the performance merits of the methods instantiated from our retrieval framework when compared to other existing retrieval methods.
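The abstract does not spell out the query models themselves, so the following is only a minimal sketch of the general idea of unsupervised query modeling with relevance cues. It uses textbook pseudo-relevance feedback over unigram language models and is not the authors' formulation. All names (`ml_model`, `rank`, `expand_query`), the Jelinek-Mercer smoothing weight `lam`, the interpolation weight `alpha`, and the toy documents are assumptions made for illustration. Non-relevance cues and multi-granularity index features are not modeled here.

```python
import math
from collections import Counter


def ml_model(tokens):
    """Maximum-likelihood unigram model: relative word frequencies."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def rank(query_model, docs, lam=0.5):
    """Rank documents by sum_w P(w|Q) * log P(w|D), which is
    rank-equivalent to negative KL divergence between query and
    document models. Document models use Jelinek-Mercer smoothing
    with the collection model (assumed weight lam)."""
    collection = ml_model([w for d in docs for w in d])
    scored = []
    for i, d in enumerate(docs):
        doc_model = ml_model(d)
        score = 0.0
        for w, pq in query_model.items():
            p = (1 - lam) * doc_model.get(w, 0.0) + lam * collection.get(w, 0.0)
            if p > 0:  # words absent from the whole collection are skipped
                score += pq * math.log(p)
        scored.append((score, i))
    # Highest score first; ties broken by document index.
    return [i for _, i in sorted(scored, key=lambda t: (-t[0], t[1]))]


def expand_query(query_tokens, docs, top_k=2, alpha=0.5, lam=0.5):
    """Interpolate the original query model with a feedback model
    averaged over the top_k initially retrieved documents, which are
    assumed (without supervision) to be relevant."""
    q_ml = ml_model(query_tokens)
    top = rank(q_ml, docs, lam)[:top_k]
    feedback = Counter()
    for i in top:
        for w, p in ml_model(docs[i]).items():
            feedback[w] += p / len(top)
    words = set(q_ml) | set(feedback)
    return {
        w: alpha * q_ml.get(w, 0.0) + (1 - alpha) * feedback.get(w, 0.0)
        for w in words
    }


# Toy collection (hypothetical transcripts, already tokenized).
docs = [
    ["stock", "market", "shares", "fall"],
    ["market", "shares", "trading", "rise"],
    ["football", "match", "score", "goal"],
    ["trading", "shares", "investors"],
]
query = ["market"]

initial = rank(ml_model(query), docs)
expanded_model = expand_query(query, docs)
expanded = rank(expanded_model, docs)
print("initial ranking: ", initial)
print("expanded ranking:", expanded)
```

In this toy setup, the third and fourth documents tie under the one-word query because neither contains "market". The expanded query model picks up words such as "shares" and "trading" from the top-ranked documents, which lets the finance-related document without the query word move ahead of the unrelated one.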