Domain dependent query reformulation for web search

Proceedings of the 21st ACM International Conference on Information and Knowledge Management. Pub Date: 2012-10-29. DOI: 10.1145/2396761.2398401

Van Dang, G. Kumaran, Adam D. Troy


Citations: 3

Abstract

Query reformulation has been studied as a domain independent task. Existing work attempts to expand a query, or substitute its terms, with the same set of candidates regardless of the query's domain. Since terms might be semantically related in one domain but not in others, it is more effective to provide candidates for queries with respect to their domain. This paper demonstrates the advantage of this domain dependent query reformulation approach, which uses a standard technique to learn candidates for each domain from a separate sample of data derived automatically from a generic query log. Our results show that our approach statistically significantly outperforms the domain independent approach, which learns to reformulate from the same log using the same technique, on a large query set consisting of both health and commerce queries. Our results have a very practical interpretation: building different reformulation systems to handle queries from different domains requires no additional manual effort, yet it provides substantially better retrieval effectiveness than having a single system handle all queries. Additionally, we show that leveraging domain specific manually labelled data leads to further improvement.
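
As a rough illustration of the idea in the abstract, the following minimal Python sketch learns term-substitution candidates separately for each domain from pairs of consecutive queries. It is not the authors' method: the abstract names neither the learning technique nor how the per-domain samples are derived, so the keyword-based `guess_domain` classifier, the `DOMAIN_KEYWORDS` lists, the single-term-substitution counting, and the toy `LOG` are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical keyword lists standing in for an automatic domain classifier.
DOMAIN_KEYWORDS = {
    "health": {"symptoms", "treatment", "flu", "fever"},
    "commerce": {"buy", "price", "cheap", "jacket"},
}


def guess_domain(query):
    """Toy classifier: first domain that shares a keyword with the query."""
    terms = set(query.split())
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if terms & keywords:
            return domain
    return "other"


def learn_substitutions(query_pairs):
    """Count single-term substitutions between consecutive queries, per domain."""
    tables = defaultdict(lambda: defaultdict(Counter))
    for before, after in query_pairs:
        a, b = before.split(), after.split()
        if len(a) != len(b):
            continue  # only same-length rewrites are considered in this sketch
        diffs = [(x, y) for x, y in zip(a, b) if x != y]
        if len(diffs) == 1:
            old, new = diffs[0]
            tables[guess_domain(before)][old][new] += 1
    return tables


def candidates(tables, query, term, k=3):
    """Return up to k substitution candidates for `term` in the query's domain."""
    domain = guess_domain(query)
    return [t for t, _ in tables[domain][term].most_common(k)]


# Toy query log (made up for illustration): (query, next query in session).
LOG = [
    ("cold symptoms", "flu symptoms"),
    ("cold treatment", "flu treatment"),
    ("buy cold jacket", "buy winter jacket"),
    ("cheap cold jacket", "cheap winter jacket"),
]
TABLES = learn_substitutions(LOG)

print(candidates(TABLES, "cold symptoms", "cold"))    # ['flu']
print(candidates(TABLES, "buy cold jacket", "cold"))  # ['winter']
```

In this toy log the term "cold" receives different candidates in the health and commerce domains, whereas a single domain independent table would merge "flu" and "winter" into one candidate list for every query.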


