{"title":"A scalable distributed spatial index for the internet-of-things","authors":"A. Iyer, I. Stoica","doi":"10.1145/3127479.3132254","journal":{"name":"Proceedings of the 2017 Symposium on Cloud Computing","volume":"45 1","pages":""},"publicationDate":"2017-09-24","publicationTypes":"Journal Article","citationCount":"12","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1145/3127479.3132254"}
A scalable distributed spatial index for the internet-of-things
The increasing interest in the Internet-of-Things (IoT) suggests that a new source of big data is imminent: the machines and sensors in the IoT ecosystem. The fundamental characteristic of the data produced by these sources is that it is inherently geospatial. In addition, it exhibits unprecedented and unpredictable skew. Thus, big data systems designed for IoT applications must be able to efficiently ingest, index, and query spatial data with heavy and unpredictable skew. Spatial indexing is a well-explored area of research, but little attention has been given to efficient distributed spatial indexing. In this paper, we propose Sift, a distributed spatial index, and describe its implementation. Unlike systems that depend on load-balancing mechanisms that kick in after ingestion, Sift tries to distribute the incoming data along the distributed structure at indexing time and thus incurs minimal rebalancing overhead. Sift depends only on an underlying key-value store, and hence can be implemented in many existing big data stores. Our evaluation of Sift on a popular open-source data store shows promising results: in a distributed environment, Sift achieves up to an 8× reduction in indexing overhead compared to the state of the art, while simultaneously reducing query latency by over 2× and index size by over 3×.
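To make concrete what it means for a spatial index to depend only on a key-value store, here is a minimal Python sketch. It is not Sift's algorithm, which the abstract does not describe. It substitutes a generic Z-order (Morton) key encoding, a common way to map two-dimensional points onto a one-dimensional sorted key space. All names (`ToySpatialIndex`, `morton_key`, `BITS`) are hypothetical, and an in-memory sorted list stands in for the key-value store.

```python
import bisect

BITS = 16  # assumed grid resolution per axis; not taken from the paper


def quantize(value, lo, hi, bits=BITS):
    """Map a value in [lo, hi] to an integer cell in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    frac = (value - lo) / (hi - lo)
    return min(cells, max(0, int(frac * cells)))


def interleave(x, y, bits=BITS):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def morton_key(lat, lon):
    """Z-order key for a latitude/longitude pair."""
    return interleave(quantize(lon, -180.0, 180.0),
                      quantize(lat, -90.0, 90.0))


class ToySpatialIndex:
    """Stand-in for a sorted key-value store holding Z-order keys."""

    def __init__(self):
        self._rows = []  # kept sorted as (key, lat, lon, payload)

    def put(self, lat, lon, payload):
        bisect.insort(self._rows, (morton_key(lat, lon), lat, lon, payload))

    def query_box(self, lat_min, lon_min, lat_max, lon_max):
        # The Z-order key is monotone in each coordinate, so every point
        # inside the box has a key between the keys of the two corners.
        lo = morton_key(lat_min, lon_min)
        hi = morton_key(lat_max, lon_max)
        start = bisect.bisect_left(self._rows, (lo,))
        results = []
        for key, lat, lon, payload in self._rows[start:]:
            if key > hi:
                break
            # The key range also covers cells outside the box; filter them.
            if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
                results.append(payload)
        return results


index = ToySpatialIndex()
index.put(37.87, -122.27, "berkeley")
index.put(37.77, -122.42, "sf")
index.put(40.71, -74.01, "nyc")
nearby = sorted(index.query_box(37.0, -123.0, 38.5, -122.0))
```

A fixed encoding like this one sends all points from a dense region to a narrow key range, so it offers no protection against the skew the abstract highlights. That is the problem Sift is designed to address by distributing data at indexing time.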