Dong Xie, Feifei Li, Bin Yao, Gefei Li, Zhongpu Chen, Liang Zhou, M. Guo
{"title":"Simba: spatial in-memory big data analysis","authors":"Dong Xie, Feifei Li, Bin Yao, Gefei Li, Zhongpu Chen, Liang Zhou, M. Guo","doi":"10.1145/2996913.2996935","DOIUrl":null,"url":null,"abstract":"We present the Simba (Spatial In-Memory Big data Analytics) system, which offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. Simba natively extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and DataFrame API. It enables the construction of indexes over RDDs inside the engine in order to work with big spatial data and complex spatial operations. Simba also comes with an effective query optimizer, which leverages its indexes and novel spatial-aware optimizations, to achieve both low latency and high throughput in big spatial data analysis. This demonstration proposal describes key ideas in the design of Simba, and presents a demonstration plan.","PeriodicalId":20525,"journal":{"name":"Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","volume":"18 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2996913.2996935","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16
Abstract
We present the Simba (Spatial In-Memory Big data Analytics) system, which offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. Simba natively extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and DataFrame API. It enables the construction of indexes over RDDs inside the engine in order to work with big spatial data and complex spatial operations. Simba also comes with an effective query optimizer, which leverages its indexes and novel spatial-aware optimizations, to achieve both low latency and high throughput in big spatial data analysis. This demonstration proposal describes key ideas in the design of Simba, and presents a demonstration plan.