{"title":"Maximizing Range Sum in External Memory","authors":"Dong-Wan Choi, C. Chung, Yufei Tao","doi":"10.1145/2629477","DOIUrl":null,"url":null,"abstract":"This article studies the MaxRS problem in spatial databases. Given a set O of weighted points and a rectangle r of a given size, the goal of the MaxRS problem is to find a location of r such that the sum of the weights of all the points covered by r is maximized. This problem is useful in many location-based services such as finding the best place for a new franchise store with a limited delivery range and finding the hotspot with the largest number of nearby attractions for a tourist with a limited reachable range. However, the problem has been studied mainly in the theoretical perspective, particularly in computational geometry. The existing algorithms from the computational geometry community are in-memory algorithms that do not guarantee the scalability. In this article, we propose a scalable external-memory algorithm (ExactMaxRS) for the MaxRS problem that is optimal in terms of the I/O complexity. In addition, we propose an approximation algorithm (ApproxMaxCRS) for the MaxCRS problem that is a circle version of the MaxRS problem. We prove the correctness and optimality of the ExactMaxRS algorithm along with the approximation bound of the ApproxMaxCRS algorithm.\n Furthermore, motivated by the fact that all the existing solutions simply assume that there is no tied area for the best location, we extend the MaxRS problem to a more fundamental problem, namely AllMaxRS, so that all the locations with the same best score can be retrieved. We first prove that the AllMaxRS problem cannot be trivially solved by applying the techniques for the MaxRS problem. Then we propose an output-sensitive external-memory algorithm (TwoPhaseMaxRS) that gives the exact solution for the AllMaxRS problem through two phases. Also, we prove both the soundness and completeness of the result returned from TwoPhaseMaxRS.\n From extensive experimental results, we show that ExactMaxRS and ApproxMaxCRS are several orders of magnitude faster than methods adapted from existing algorithms, the approximation bound in practice is much better than the theoretical bound of ApproxMaxCRS, and TwoPhaseMaxRS is not only much faster but also more robust than the straightforward extension of ExactMaxRS.","PeriodicalId":50915,"journal":{"name":"ACM Transactions on Database Systems","volume":"68 1","pages":"21:1-21:44"},"PeriodicalIF":2.2000,"publicationDate":"2014-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Database Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/2629477","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 25
Abstract
This article studies the MaxRS problem in spatial databases. Given a set O of weighted points and a rectangle r of a given size, the goal of the MaxRS problem is to find a location of r such that the sum of the weights of all the points covered by r is maximized. This problem is useful in many location-based services such as finding the best place for a new franchise store with a limited delivery range and finding the hotspot with the largest number of nearby attractions for a tourist with a limited reachable range. However, the problem has been studied mainly in the theoretical perspective, particularly in computational geometry. The existing algorithms from the computational geometry community are in-memory algorithms that do not guarantee the scalability. In this article, we propose a scalable external-memory algorithm (ExactMaxRS) for the MaxRS problem that is optimal in terms of the I/O complexity. In addition, we propose an approximation algorithm (ApproxMaxCRS) for the MaxCRS problem that is a circle version of the MaxRS problem. We prove the correctness and optimality of the ExactMaxRS algorithm along with the approximation bound of the ApproxMaxCRS algorithm.
Furthermore, motivated by the fact that all the existing solutions simply assume that there is no tied area for the best location, we extend the MaxRS problem to a more fundamental problem, namely AllMaxRS, so that all the locations with the same best score can be retrieved. We first prove that the AllMaxRS problem cannot be trivially solved by applying the techniques for the MaxRS problem. Then we propose an output-sensitive external-memory algorithm (TwoPhaseMaxRS) that gives the exact solution for the AllMaxRS problem through two phases. Also, we prove both the soundness and completeness of the result returned from TwoPhaseMaxRS.
From extensive experimental results, we show that ExactMaxRS and ApproxMaxCRS are several orders of magnitude faster than methods adapted from existing algorithms, the approximation bound in practice is much better than the theoretical bound of ApproxMaxCRS, and TwoPhaseMaxRS is not only much faster but also more robust than the straightforward extension of ExactMaxRS.
期刊介绍:
Heavily used in both academic and corporate R&D settings, ACM Transactions on Database Systems (TODS) is a key publication for computer scientists working in data abstraction, data modeling, and designing data management systems. Topics include storage and retrieval, transaction management, distributed and federated databases, semantics of data, intelligent databases, and operations and algorithms relating to these areas. In this rapidly changing field, TODS provides insights into the thoughts of the best minds in database R&D.