A Comparison of Automatic Search Query Enhancement Algorithms That Utilise Wikipedia as a Source of A Priori Knowledge
Authors: Kyle Goslin, M. Hofmann
Published in: Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation, 2017-12-08
DOI: 10.1145/3158354.3158356 (https://doi.org/10.1145/3158354.3158356)
Citations: 1
Abstract
This paper describes the benchmarking and analysis of five Automatic Search Query Enhancement (ASQE) algorithms that utilise Wikipedia as the sole source of a priori knowledge. The contributions of this paper are: 1) a comprehensive review of current ASQE algorithms that utilise Wikipedia as the sole source of a priori knowledge; 2) benchmarking of five existing ASQE algorithms using the TREC-9 Web Topics on the ClueWeb12 data set; and 3) analysis of the benchmarking results to identify the strengths and weaknesses of each algorithm. During the benchmarking process, 2,500 relevance assessments were performed. The results are analysed using Average Precision @10 per query and Mean Average Precision @10 per algorithm. From this analysis we show that the scope of a priori knowledge utilised during enhancement and the term weighting methods available from Wikipedia can further aid the ASQE process. Although the approaches taken by the algorithms are still relevant, an over-dependence on the weighting schemes and data sources used can easily affect the results of an ASQE algorithm.
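The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so this assumes one common definition of Average Precision @10: the precision at each rank where a relevant result appears, averaged over the relevant results found in the top 10. Other variants normalise by the total number of relevant documents instead. The function names and the boolean relevance lists are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def average_precision_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """AP@k for one query.

    relevance[i] is True when the result at rank i + 1 was judged relevant.
    Assumption: normalise by the number of relevant results found in the
    top k (the paper may use a different normalisation).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(
    runs: Sequence[Sequence[bool]], k: int = 10
) -> float:
    """MAP@k for one algorithm: the mean of AP@k over all of its queries."""
    if not runs:
        return 0.0
    return sum(average_precision_at_k(run, k) for run in runs) / len(runs)


# Hypothetical judgements for two queries answered by one algorithm.
query_a = [True, False, True] + [False] * 7
query_b = [False] * 10
ap_a = average_precision_at_k(query_a)
map_score = mean_average_precision_at_k([query_a, query_b])
```

Under this definition, a query with no relevant results in the top 10 scores 0, which pulls down the per-algorithm mean.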