Rui Zhang, Jianzhong Qi, Martin Stradling, Jin Huang
{"title":"迈向空间对象的无痛索引","authors":"Rui Zhang, Jianzhong Qi, Martin Stradling, Jin Huang","doi":"10.1145/2629333","DOIUrl":null,"url":null,"abstract":"Conventional spatial indexes, represented by the R-tree, employ multidimensional tree structures that are complicated and require enormous efforts to implement in a full-fledged database management system (DBMS). An alternative approach for supporting spatial queries is mapping-based indexing, which maps both data and queries into a one-dimensional space such that data can be indexed and queries can be processed through a one-dimensional indexing structure such as the B+. Mapping-based indexing requires implementing only a few mapping functions, incurring much less effort in implementation compared to conventional spatial index structures. Yet, a major concern about using mapping-based indexes is their lower efficiency than conventional tree structures.\n In this article, we propose a mapping-based spatial indexing scheme called Size Separation Indexing (SSI). SSI is equipped with a suite of techniques including size separation, data distribution transformation, and more efficient mapping algorithms. These techniques overcome the drawbacks of existing mapping-based indexes and significantly improve the efficiency of query processing. We show through extensive experiments that, for window queries on spatial objects with nonzero extents, SSI has two orders of magnitude better performance than existing mapping-based indexes and competitive performance to the R-tree as a standalone implementation. We have also implemented SSI on top of two off-the-shelf DBMSs, PostgreSQL and a commercial platform, both having R-tree implementation. In this case, SSI is up to two orders of magnitude faster than their provided spatial indexes. Therefore, we achieve a spatial index more efficient than the R-tree in a DBMS implementation that is at the same time easy to implement. This result may upset a common perception that has existed for a long time in this area that the R-tree is the best choice for indexing spatial objects.","PeriodicalId":50915,"journal":{"name":"ACM Transactions on Database Systems","volume":"26 1","pages":"19:1-19:42"},"PeriodicalIF":2.2000,"publicationDate":"2014-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"Towards a Painless Index for Spatial Objects\",\"authors\":\"Rui Zhang, Jianzhong Qi, Martin Stradling, Jin Huang\",\"doi\":\"10.1145/2629333\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Conventional spatial indexes, represented by the R-tree, employ multidimensional tree structures that are complicated and require enormous efforts to implement in a full-fledged database management system (DBMS). An alternative approach for supporting spatial queries is mapping-based indexing, which maps both data and queries into a one-dimensional space such that data can be indexed and queries can be processed through a one-dimensional indexing structure such as the B+. Mapping-based indexing requires implementing only a few mapping functions, incurring much less effort in implementation compared to conventional spatial index structures. Yet, a major concern about using mapping-based indexes is their lower efficiency than conventional tree structures.\\n In this article, we propose a mapping-based spatial indexing scheme called Size Separation Indexing (SSI). SSI is equipped with a suite of techniques including size separation, data distribution transformation, and more efficient mapping algorithms. These techniques overcome the drawbacks of existing mapping-based indexes and significantly improve the efficiency of query processing. We show through extensive experiments that, for window queries on spatial objects with nonzero extents, SSI has two orders of magnitude better performance than existing mapping-based indexes and competitive performance to the R-tree as a standalone implementation. We have also implemented SSI on top of two off-the-shelf DBMSs, PostgreSQL and a commercial platform, both having R-tree implementation. In this case, SSI is up to two orders of magnitude faster than their provided spatial indexes. Therefore, we achieve a spatial index more efficient than the R-tree in a DBMS implementation that is at the same time easy to implement. This result may upset a common perception that has existed for a long time in this area that the R-tree is the best choice for indexing spatial objects.\",\"PeriodicalId\":50915,\"journal\":{\"name\":\"ACM Transactions on Database Systems\",\"volume\":\"26 1\",\"pages\":\"19:1-19:42\"},\"PeriodicalIF\":2.2000,\"publicationDate\":\"2014-10-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Database Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/2629333\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Database Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/2629333","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Conventional spatial indexes, represented by the R-tree, employ multidimensional tree structures that are complicated and require enormous efforts to implement in a full-fledged database management system (DBMS). An alternative approach for supporting spatial queries is mapping-based indexing, which maps both data and queries into a one-dimensional space such that data can be indexed and queries can be processed through a one-dimensional indexing structure such as the B+. Mapping-based indexing requires implementing only a few mapping functions, incurring much less effort in implementation compared to conventional spatial index structures. Yet, a major concern about using mapping-based indexes is their lower efficiency than conventional tree structures.
In this article, we propose a mapping-based spatial indexing scheme called Size Separation Indexing (SSI). SSI is equipped with a suite of techniques including size separation, data distribution transformation, and more efficient mapping algorithms. These techniques overcome the drawbacks of existing mapping-based indexes and significantly improve the efficiency of query processing. We show through extensive experiments that, for window queries on spatial objects with nonzero extents, SSI has two orders of magnitude better performance than existing mapping-based indexes and competitive performance to the R-tree as a standalone implementation. We have also implemented SSI on top of two off-the-shelf DBMSs, PostgreSQL and a commercial platform, both having R-tree implementation. In this case, SSI is up to two orders of magnitude faster than their provided spatial indexes. Therefore, we achieve a spatial index more efficient than the R-tree in a DBMS implementation that is at the same time easy to implement. This result may upset a common perception that has existed for a long time in this area that the R-tree is the best choice for indexing spatial objects.
期刊介绍:
Heavily used in both academic and corporate R&D settings, ACM Transactions on Database Systems (TODS) is a key publication for computer scientists working in data abstraction, data modeling, and designing data management systems. Topics include storage and retrieval, transaction management, distributed and federated databases, semantics of data, intelligent databases, and operations and algorithms relating to these areas. In this rapidly changing field, TODS provides insights into the thoughts of the best minds in database R&D.