Computing paths over a terrain that are highly occluded with respect to observers is an important problem in GIS. Given a fast algorithm for computing the visibility map, the path-planning step becomes the bottleneck. In this paper, we present an approach for quickly computing occluded paths over a terrain using a sparse 1-dimensional network laid over the terrain. We present different strategies for constructing this sparse network. Experimental results show that our approach significantly reduces the time needed to compute highly occluded paths between two query points, and that the different strategies offer a tradeoff between higher-quality paths and lower preprocessing times. Furthermore, some strategies achieve near-optimal paths with small preprocessing cost.
{"title":"Computing highly occluded paths using a sparse network","authors":"Niel Lebeck, Thomas Mølhave, P. Agarwal","doi":"10.1145/2666310.2666394","DOIUrl":"https://doi.org/10.1145/2666310.2666394","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127794687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
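The query step described in the abstract on occluded paths can be illustrated with a shortest-path search over a small sparse graph. The sketch below is only an illustration under stated assumptions, not the authors' method: the adjacency-list layout, the per-edge `visible` fraction, the `visibility_weight` parameter, and the cost formula `length * (1 + visibility_weight * visible)` are all hypothetical choices made here so that visible edges are penalized.

```python
import heapq

def occluded_path(graph, source, target, visibility_weight=10.0):
    """Dijkstra search that penalizes visible edges (assumed cost model).

    graph maps a node to a list of (neighbor, length, visible) tuples, where
    visible is the fraction of the edge seen by observers, in [0, 1].
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length, visible in graph.get(u, []):
            # Hypothetical edge cost: length inflated by how visible the edge is.
            nd = d + length * (1.0 + visibility_weight * visible)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]

# Toy network: the route through "a" is shorter but fully visible,
# the route through "b" is longer but fully occluded.
network = {
    "s": [("a", 1.0, 1.0), ("b", 2.0, 0.0)],
    "a": [("t", 1.0, 1.0)],
    "b": [("t", 2.0, 0.0)],
    "t": [],
}
print(occluded_path(network, "s", "t"))
```

With the default penalty the search prefers the longer occluded route; setting `visibility_weight=0.0` reduces it to an ordinary shortest-path query.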
This paper presents TREADS, a novel travel route recommendation system that suggests safe travel itineraries in real time by incorporating social media data and point-of-interest review summarization techniques. The system consists of an efficient route recommendation service that considers safety and user interest factors, a high-accuracy retriever of transportation-related tweets, and a novel text summarization module that provides summaries of location-based Twitter data and Yelp reviews to enhance the route recommendation service. We demonstrate the system using crime and point-of-interest data from the Washington, DC area. TREADS aims to provide safe, effective, and convenient travel strategies for commuters and tourists. The proposed system, integrated with multiple social media resources, can greatly improve the travel experience for tourists in unfamiliar cities.
{"title":"TREADS: a safe route recommender using social media mining and text summarization","authors":"Kaiqun Fu, Yen-Cheng Lu, Chang-Tien Lu","doi":"10.1145/2666310.2666368","DOIUrl":"https://doi.org/10.1145/2666310.2666368","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132491531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Bolzoni, S. Helmer, Kevin Wellenzohn, J. Gamper, Periklis Andritsos
We propose a more realistic approach to trip planning for tourist applications by adding category information to points of interest (POIs). This makes it easier for tourists to formulate their preferences by stating constraints on categories rather than individual POIs. However, solving this problem is not just a matter of extending existing algorithms. In our approach we exploit the fact that POIs are usually not evenly distributed but tend to appear in clusters. We develop a group of efficient algorithms based on clustering with guaranteed theoretical bounds. We also evaluate our algorithms experimentally, using real-world data sets, showing that in practice the results are better than the theoretical guarantees and very close to the optimal solution.
{"title":"Efficient itinerary planning with category constraints","authors":"P. Bolzoni, S. Helmer, Kevin Wellenzohn, J. Gamper, Periklis Andritsos","doi":"10.1145/2666310.2666411","DOIUrl":"https://doi.org/10.1145/2666310.2666411","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131072948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chandan Misra, A. Dasgupta, S. Ghosh, D. Bhattacharyya
This work presents GeomSMS, the first full-fledged SMS framework with native support for geometric objects, for sharing spatial information ubiquitously across mobile users. GeomSMS extends the Open GeoSMS standard of the Open Geospatial Consortium (OGC), which provides developers with a Short Message Service (SMS) encoding for sharing only location information, namely latitude and longitude, between location-based services (LBS) and applications. GeomSMS keeps the GeoSMS standard as it is but adds support for sharing two other geometric features, line and polygon, in addition to the existing point feature. GeomSMS shares these features in the SMS payload without altering the GeoSMS standard. We describe the architecture of a system that uses the framework and demonstrate a real-life mobile application, BeckonMe, with one example each of the line and polygon features.
{"title":"A demonstration of GeomSMS: an SMS framework for sharing geospatial features","authors":"Chandan Misra, A. Dasgupta, S. Ghosh, D. Bhattacharyya","doi":"10.1145/2666310.2666372","DOIUrl":"https://doi.org/10.1145/2666310.2666372","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130796137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ahmed Loai Ali, Falko Schmid, R. Al-Salman, Tomi Kauppinen
With the ubiquity of technology and tools, current Volunteered Geographic Information (VGI) projects allow the public to contribute, maintain, and use geo-spatial data. One of the most prominent and successful VGI projects is OpenStreetMap (OSM), where more than one million volunteers have collected and contributed data that is available to everybody. However, this kind of contribution mechanism is usually associated with data quality issues, e.g., geographic entities such as gardens or parks can be assigned an inappropriate classification by volunteers. Based on the observation that geographic features usually inherit certain properties and characteristics, we propose a novel classification-based approach that allows the identification of entities with an inappropriate classification. We use the rich data set of OSM to analyze the properties of geographic entities with respect to their implicit characteristics in order to develop classifiers based on them. The classifiers we developed show high detection accuracy. However, due to the absence of proper training data, we additionally performed a user study to verify our findings by means of intra-user-agreement. The results of our study support the detections of our classifiers and show that our classification-based approaches can be a valuable tool for managing and improving VGI data.
{"title":"Ambiguity and plausibility: managing classification quality in volunteered geographic information","authors":"Ahmed Loai Ali, Falko Schmid, R. Al-Salman, Tomi Kauppinen","doi":"10.1145/2666310.2666392","DOIUrl":"https://doi.org/10.1145/2666310.2666392","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129607042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mining GPS trajectories of moving vehicles has led to many research directions, such as traffic modeling and driving prediction. An important challenge is how to map GPS traces to a road network accurately under noisy conditions. However, to the best of our knowledge, there is no existing work that first simplifies a trajectory to improve map matching. In this paper we propose three trajectory simplification algorithms that can deal with both offline and online trajectory data. We use weighting functions to incorporate spatial knowledge, such as segment lengths and turning angles, into our simplification algorithms. In addition, we measure the noise degree of a GPS point based on its spatio-temporal relationship to its neighbors. The effectiveness of our algorithms is comprehensively evaluated on real trajectory datasets with varying noise levels and sampling rates. Our evaluation shows that under highly noisy conditions, our proposed algorithms considerably improve map matching accuracy and reduce computational costs compared to the state-of-the-art methods.
{"title":"Spatio-temporal trajectory simplification for inferring travel paths","authors":"Hengfeng Li, L. Kulik, K. Ramamohanarao","doi":"10.1145/2666310.2666409","DOIUrl":"https://doi.org/10.1145/2666310.2666409","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"123 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114026737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mobile users engage in novel and exciting location-based social media applications (e.g., geosocial networks, spatial crowdsourcing) in which they interact with other users situated in their proximity. In several application scenarios, users define their own proximity zones of interest (typically in the form of polygonal regions, such as a collection of city blocks), and want to find other users with whom they are in a mutual enclosure relationship with respect to their respective proximity zones. This boils down to evaluating two point-in-polygon enclosure conditions, which is easy to achieve for revealed user locations and proximity zones. However, users may be reluctant to share their whereabouts with their friends and with social media service providers, as location data can help one infer sensitive details such as an individual's health status, financial situation or lifestyle choices. In this paper, we propose a mechanism that allows users to securely evaluate mutual proximity zone enclosure on encrypted location data. Our solution uses homomorphic encryption, and supports convex polygonal proximity zones. We provide a security analysis of the proposed solution, we investigate performance optimizations, and we show experimentally that our approach scales well for datasets of millions of users.
{"title":"Secure mutual proximity zone enclosure evaluation","authors":"Sunoh Choi, Gabriel Ghinita, E. Bertino","doi":"10.1145/2666310.2666384","DOIUrl":"https://doi.org/10.1145/2666310.2666384","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116021235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
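The abstract on mutual proximity zone enclosure reduces the problem to two point-in-polygon conditions over convex zones. The sketch below shows only that plaintext check, for revealed locations and zones; it does not attempt the homomorphic-encryption protocol the paper proposes. The function names and the cross-product sign test are choices made here for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def inside_convex(point: Point, polygon: List[Point]) -> bool:
    """Return True if point lies inside or on the boundary of a convex polygon.

    Vertices must be listed in a consistent order (clockwise or
    counter-clockwise). The point is inside when it is on the same side of
    every edge, i.e. all non-zero cross products share one sign.
    """
    sign = 0
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        cross = (bx - ax) * (point[1] - ay) - (by - ay) * (point[0] - ax)
        if cross != 0:
            current = 1 if cross > 0 else -1
            if sign == 0:
                sign = current
            elif current != sign:
                return False
    return True

def mutual_enclosure(loc_a: Point, zone_a: List[Point],
                     loc_b: Point, zone_b: List[Point]) -> bool:
    """Each user must be located inside the other user's proximity zone."""
    return inside_convex(loc_a, zone_b) and inside_convex(loc_b, zone_a)

zone_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
zone_b = [(2, 2), (6, 2), (6, 6), (2, 6)]
print(mutual_enclosure((3, 3), zone_a, (3.5, 2.5), zone_b))
print(mutual_enclosure((3, 3), zone_a, (5, 5), zone_b))
```

In the first call both users sit in the overlap of the two squares, so the relationship holds; in the second, user B is outside zone A, so it does not.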
Aaron T. Myers, S. Movva, R. Karthik, B. Bhaduri, D. White, N. Thomas, Adrian S Z Chase
The Bioenergy Knowledge Discovery Framework (BioenergyKDF) is a scalable, web-based collaborative environment for scientists working on bioenergy-related research in which the connections between data, literature, and models can be explored and more clearly understood. The fully operational and deployed system, built on multiple open source libraries and architectures, stores contributions from the community of practice and makes them easy to find, but that is just its base functionality. The BioenergyKDF provides a national spatiotemporal decision support capability that enables data sharing, analysis, modeling, and visualization, and fosters the development and management of the U.S. bioenergy infrastructure, which is an essential component of the national energy infrastructure. The BioenergyKDF is built on a flexible, customizable platform that can be extended to support the requirements of any user community, especially those that work with spatiotemporal data. While there are several community data-sharing software platforms available, some developed and distributed by national governments, none of them has the full suite of capabilities available in BioenergyKDF. For example, its component-based platform and database-independent architecture allow it to be quickly deployed to existing infrastructure and to connect to existing data repositories (spatial or otherwise). As new data, analyses, and features are added, the BioenergyKDF will help lead research and support decisions concerning bioenergy into the future, and will also enable the development and growth of additional communities of practice both inside and outside of the Department of Energy. These communities will be able to leverage the substantial investment the agency has made in the KDF platform to quickly stand up systems that are customized to their data and research needs.
{"title":"BioenergyKDF: enabling spatiotemporal data synthesis and research collaboration","authors":"Aaron T. Myers, S. Movva, R. Karthik, B. Bhaduri, D. White, N. Thomas, Adrian S Z Chase","doi":"10.1145/2666310.2666488","DOIUrl":"https://doi.org/10.1145/2666310.2666488","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128635152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recording mobility data with GPS-enabled devices, e.g., smart phones or vehicles, has become common practice for private persons, companies, and institutions. Consequently, the requirements for managing these enormous datasets have increased drastically, so trajectory management has become an active research field. In order to avoid querying raw trajectories, which is neither convenient nor efficient, a symbolic representation of the geometric data has been introduced. A comprehensive framework for describing and querying symbolic trajectories, including an expressive pattern language as well as an efficient matching algorithm, was presented recently. A symbolic trajectory, which is basically a time-dependent symbolic value (e.g., a label), can contain names of traversed roads, a speed profile, transportation modes, behaviors of animals, or cells inside a cellular network. The quality and efficiency of transportation systems, targeted advertising, animal research, crime investigation, etc. may be improved by analyzing such data. The main contribution of this paper is an improvement of our previous approach, featuring algorithms and data structures that optimize the matching of symbolic trajectories for any kind of pattern with the help of two indexes. More specifically, a trie is applied for the symbolic values (i.e., labels or places), while the time intervals are stored in a one-dimensional R-tree. Hence, we avoid the linear scan of every trajectory, which is necessary without index support. As a result, the computation cost for the pattern matching is nearly independent of the trajectory size. Our work details the concept and the implementation of the new approach, followed by an experimental evaluation.
{"title":"Index-supported pattern matching on symbolic trajectories","authors":"Fabio Valdés, R. H. Güting","doi":"10.1145/2666310.2666402","DOIUrl":"https://doi.org/10.1145/2666310.2666402","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127588981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
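The two-index idea in the abstract on symbolic trajectories (a trie over labels plus an index over time intervals) can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the class and function names are hypothetical, a trajectory is assumed to be a list of `(start, end, label)` units, and a sorted list with binary search stands in for the one-dimensional R-tree.

```python
import bisect

class LabelTrie:
    """Character trie mapping a label to the units that carry it."""
    def __init__(self):
        self.children = {}
        self.postings = []  # (trajectory_id, unit_index) pairs ending here

    def insert(self, label, posting):
        node = self
        for ch in label:
            node = node.children.setdefault(ch, LabelTrie())
        node.postings.append(posting)

    def lookup(self, label):
        node = self
        for ch in label:
            node = node.children.get(ch)
            if node is None:
                return []
        return node.postings

class IntervalIndex:
    """Sorted-list stand-in (an assumption) for a one-dimensional R-tree."""
    def __init__(self, entries):
        # entries: (start, end, posting), kept sorted by start time
        self.entries = sorted(entries, key=lambda e: e[0])
        self.starts = [e[0] for e in self.entries]

    def overlapping(self, q_start, q_end):
        # Only units starting at or before q_end can overlap the query window.
        hi = bisect.bisect_right(self.starts, q_end)
        return {post for _, end, post in self.entries[:hi] if end >= q_start}

def build(trajectories):
    trie, entries = LabelTrie(), []
    for tid, units in trajectories.items():
        for i, (start, end, label) in enumerate(units):
            trie.insert(label, (tid, i))
            entries.append((start, end, (tid, i)))
    return trie, IntervalIndex(entries)

def query(trie, intervals, label, q_start, q_end):
    """Ids of trajectories carrying `label` at some time in [q_start, q_end]."""
    hits = set(trie.lookup(label)) & intervals.overlapping(q_start, q_end)
    return sorted({tid for tid, _ in hits})

trajectories = {
    "t1": [(0, 10, "home"), (10, 20, "Main Street")],
    "t2": [(5, 15, "Main Street")],
    "t3": [(30, 40, "Main Street")],
}
trie, intervals = build(trajectories)
print(query(trie, intervals, "Main Street", 12, 18))
```

Both lookups return candidate units without scanning every trajectory; intersecting them answers a single label-plus-time condition, which is far simpler than the full pattern language the paper supports.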
Scalable spatial query processing relies on effective spatial data partitioning for query parallelization, data pruning, and load balancing. These are often challenged by the intrinsic characteristics of spatial data, such as high skew in data distribution and high complexity of irregular multi-dimensional objects. In this demo, we present SATO, a spatial data partitioning framework that can quickly analyze and partition spatial data with an optimal spatial partitioning strategy for scalable query processing. SATO works in the following steps: 1) Sample, which samples a small fraction of the input data for analysis, 2) Analyze, which quickly analyzes the sampled data to find an optimal partition strategy, 3) Tear, which provides data-skew-aware partitioning and supports MapReduce-based scalable partitioning, and 4) Optimize, which collects succinct partition statistics for potential query optimization. SATO also provides multi-level partitioning, which can be used to significantly improve window-based queries in cloud-based spatial query processing systems. SATO comes with a visualization component that provides heat maps and histograms for qualitative evaluation. SATO has been implemented within Hadoop-GIS, a high-performance spatial data warehousing system over MapReduce. SATO is also released as an independent software package to support various scalable spatial query processing systems. Our experiments have demonstrated that SATO can generate much more balanced partitioning, which can significantly improve spatial query performance with MapReduce compared to traditional spatial partitioning approaches.
{"title":"SATO: a spatial data partitioning framework for scalable query processing","authors":"Hoang Vo, Ablimit Aji, Fusheng Wang","doi":"10.1145/2666310.2666365","DOIUrl":"https://doi.org/10.1145/2666310.2666365","url":null,"PeriodicalId":153031,"journal":{"name":"Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"214 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121471351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
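The four steps named in the SATO abstract (Sample, Analyze, Tear, Optimize) can be mimicked on a single machine to show how a sample-driven partitioning adapts to skew. The sketch below is a toy under stated assumptions and does not reproduce SATO or its MapReduce implementation: the function bodies, the median-split binary space partitioning used in `analyze`, and the parameters `fraction` and `max_per_cell` are all hypothetical choices made here.

```python
import random
from collections import Counter

def sample(points, fraction, seed=0):
    """Sample: draw a small random fraction of the input points."""
    rng = random.Random(seed)
    return rng.sample(points, max(1, int(len(points) * fraction)))

def analyze(sampled, max_per_cell, bounds, depth=0):
    """Analyze: split bounds at the sample median, alternating x and y.

    Dense regions are split more often, so cells follow the data skew.
    Returns a list of (xmin, ymin, xmax, ymax) cells that tile bounds.
    """
    if len(sampled) <= max_per_cell:
        return [bounds]
    axis = depth % 2
    cut = sorted(p[axis] for p in sampled)[len(sampled) // 2]
    left = [p for p in sampled if p[axis] < cut]
    right = [p for p in sampled if p[axis] >= cut]
    if not left or not right:
        return [bounds]  # cannot split further along this axis
    xmin, ymin, xmax, ymax = bounds
    if axis == 0:
        lb, rb = (xmin, ymin, cut, ymax), (cut, ymin, xmax, ymax)
    else:
        lb, rb = (xmin, ymin, xmax, cut), (xmin, cut, xmax, ymax)
    return (analyze(left, max_per_cell, lb, depth + 1)
            + analyze(right, max_per_cell, rb, depth + 1))

def tear(points, cells):
    """Tear: assign every point to the first cell that contains it."""
    assignment = {}
    for idx, (x, y) in enumerate(points):
        for cid, (xmin, ymin, xmax, ymax) in enumerate(cells):
            if xmin <= x <= xmax and ymin <= y <= ymax:
                assignment[idx] = cid
                break
    return assignment

def optimize(assignment):
    """Optimize: collect per-cell point counts as partition statistics."""
    return Counter(assignment.values())

# Skewed input: a dense cluster near the origin plus sparse background points.
rng = random.Random(42)
points = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(900)]
points += [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(100)]

cells = analyze(sample(points, 0.1), max_per_cell=10, bounds=(0, 0, 100, 100))
stats = optimize(tear(points, cells))
print(len(cells), "cells; largest cell holds", max(stats.values()), "points")
```

Because the cut positions come from the sample rather than from a fixed grid, the dense cluster is divided into many small cells while the sparse background gets a few large ones, which is the intuition behind skew-aware partitioning.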