Fast Filtering of Search Results Sorted by Attribute

ACM Transactions on Information Systems (TOIS) Pub Date : 2021-11-24 DOI:10.1145/3477982

F. M. Nardini, Roberto Trani, Rossano Venturini

Publication type: Journal Article. Volume: 26 1. Pages: 1-24. URL: https://doi.org/10.1145/3477982


Abstract

Modern search services often provide multiple options for ranking search results, e.g., sorting “by relevance”, “by price”, or “by discount” in e-commerce. While the traditional ranking by relevance effectively places the relevant results in the top positions of the results list, ranking by attribute can place many marginally relevant results at the head of the list, leading to a poor user experience. In the past, this issue has been addressed by investigating the relevance-aware filtering problem, which asks to select the subset of results maximizing the relevance of the attribute-sorted list. Recently, an exact algorithm has been proposed to solve this problem optimally. However, its high computational cost makes it impractical for the Web search scenario, which is characterized by huge lists of results and strict time constraints. For this reason, the problem is often solved using efficient yet inaccurate heuristic algorithms. In this article, we first prove performance bounds for the existing heuristics. We then propose two efficient and effective algorithms to solve the relevance-aware filtering problem. First, we propose OPT-Filtering, a novel exact algorithm that is faster than the existing state-of-the-art optimal algorithm. Second, we propose an approximate and even more efficient algorithm, ϵ-Filtering, which, given an allowed approximation error ϵ, finds a (1-ϵ)-optimal filtering, i.e., the relevance of its solution is at least (1-ϵ) times the optimum. We conduct a comprehensive evaluation of the two proposed algorithms against state-of-the-art competitors on two real-world public datasets. Experimental results show that OPT-Filtering achieves a speedup of up to two orders of magnitude with respect to the existing optimal solution, while ϵ-Filtering further improves this result by trading effectiveness for efficiency. In particular, experiments show that ϵ-Filtering can achieve quasi-optimal solutions while being faster than all state-of-the-art competitors in most of the tested configurations.
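To make the problem statement and the (1-ϵ) guarantee concrete, the following is a minimal Python sketch, not the paper's method. The abstract does not specify how the relevance of a list is measured or whether the output size is bounded, so the sketch assumes a DCG-style score and a maximum of k kept results. The names dcg, brute_force_filtering, and is_within_guarantee are hypothetical. The exhaustive search takes exponential time and only serves as a reference optimum on tiny inputs; it is neither OPT-Filtering nor ϵ-Filtering.

```python
from itertools import combinations
from math import log2


def dcg(relevances):
    """Assumed quality measure: discounted cumulative gain, read top to bottom."""
    return sum(rel / log2(pos + 2) for pos, rel in enumerate(relevances))


def brute_force_filtering(relevances, k):
    """Exhaustively find the best filtering that keeps at most k results.

    `relevances` lists the relevance of each result in attribute order
    (e.g. ascending price). A filtering may drop results but never
    reorder them, so candidates are index tuples in increasing order.
    Returns (best_score, kept_indices).
    """
    best_score, best_subset = 0.0, ()
    n = len(relevances)
    for size in range(1, min(k, n) + 1):
        # combinations() yields indices in increasing order, which
        # preserves the attribute sort of the original list.
        for subset in combinations(range(n), size):
            score = dcg([relevances[i] for i in subset])
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


def is_within_guarantee(candidate_score, optimum_score, eps):
    """Check the (1 - eps)-optimality condition stated in the abstract."""
    return candidate_score >= (1 - eps) * optimum_score


# Five results sorted by price; the values are made-up relevance scores.
by_price = [0.1, 0.9, 0.2, 0.8, 0.05]
optimum, kept = brute_force_filtering(by_price, k=2)
print(kept, optimum)
```

On this toy input the search keeps the results at indices 1 and 3, the two most relevant ones, still in price order. A filtering produced by an approximate algorithm with ϵ = 0.1 would pass is_within_guarantee only if its score reaches at least 90% of that optimum.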

