
Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems: latest articles

Demand driven store site selection via multiple spatial-temporal data
Mengwen Xu, Tianyi Wang, Zhengwei Wu, Jingbo Zhou, Jian Li, Haishan Wu
Choosing a good location when opening a new store is crucial for the future success of a business. Traditional methods, such as offline manual surveys and analytic models based on census data, are either unable to adapt to a dynamic market or very time-consuming. The rapidly growing availability of big data from various types of mobile devices, such as online query data and offline positioning data, makes it possible to develop automatic and accurate data-driven prediction models for store site selection. In this paper, we propose a Demand Driven Store Site Selection (DD3S) framework that mines search query data from Baidu Maps, the largest online map search engine in China. DD3S first detects the spatial-temporal distributions of customer demand for different business services from the query data and identifies the gaps between demand and supply. We then determine candidate locations by clustering these gaps. In the final stage, we solve the location optimization problem by predicting and ranking the number of customers: we not only deploy supervised regression models to predict the number of customers, but also use a learning-to-rank model to rank the locations directly. We evaluate our framework on various types of businesses in real-world cases, and the experimental results demonstrate the effectiveness of our methods. DD3S has already been implemented as a core component of our business analytics platform and could potentially be used by chain store merchants on Baidu Nuomi.
DOI: https://doi.org/10.1145/2996913.2996996 (published 2016-10-31)
Citations: 31
Assisted journey recollections from photo streams: (demo paper)
Tyng-Ruey Chuang, Jheng-Peng Huang, Hsin-Huei Lee, Kae-An Liu, H. Syu
We extract GPS traces from photo streams and analyze them to reveal movement types. Speed and locale patterns hint at the kinds of activity captured by the photos of the day. When properly categorized and visualized, the photos and their movement patterns help people navigate the itineraries of their past and retrieve images of possible highlights. Our method is tolerant of erroneous and missing positional information in the photos' metadata. External geospatial resources can be further combined and visualized with the itineraries and photos to assist people's recollection of the places they visited.
DOI: https://doi.org/10.1145/2996913.2996955 (published 2016-10-31)
Citations: 0
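The photo-stream abstract says that speed patterns hint at the kind of activity and that missing positions are tolerated. A minimal sketch of that idea follows. The thresholds, the labels, and the function names are assumptions made for illustration only; the abstract does not describe the authors' actual categorization.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify_speed(speed_kmh):
    """Hypothetical thresholds, chosen only for illustration."""
    if speed_kmh < 1:
        return "stationary"
    if speed_kmh < 7:
        return "walking"
    if speed_kmh < 150:
        return "vehicle"
    return "suspect"  # implausible speed: likely a bad GPS fix or timestamp


def movement_segments(photos):
    """photos: list of (unix_time_s, lat, lon); lat/lon may be None.

    Photos without a position are skipped, so a missing fix does not break the trace.
    Returns a list of (start_time, end_time, label) for consecutive located photos.
    """
    fixes = sorted(p for p in photos if p[1] is not None and p[2] is not None)
    segments = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # identical timestamps carry no speed information
        speed_kmh = haversine_m(la0, lo0, la1, lo1) / dt * 3.6
        segments.append((t0, t1, classify_speed(speed_kmh)))
    return segments
```

Calling `movement_segments` on a day's photos yields labelled time intervals that could then be grouped into an itinerary; a real system would likely smooth the labels and use locale information as well.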
Demonstrating PlanetSense: gathering geo-spatial intelligence from crowd-sourced and social-media data
Gautam Thakur, Kevin A. Sparks, Roger G. Li, R. Stewart, M. Urban
Crowd-sourced and volunteered information, social media, and participatory sensors are capable of providing real-time activity data. Monitoring these sources when they are relevant and then using them to gather operational knowledge is important during crisis management. Beyond that, it is important to curate this information for geo-spatial research purposes, including land use classification and population occupancy analysis. In this demonstration, we will showcase PlanetSense, a geo-spatial research platform built to harness the existing power of archived data and to add to it the dynamics of heterogeneous real-time streaming data from social media and volunteered sources, seamlessly integrated with sophisticated machine learning algorithms and visualization tools. The demonstration will focus on: 1) a recent initiative emphasizing the need to harness crowd-sourced, volunteered, and social media data at scale; 2) the anatomy of, and insight into, the data collection workflow, where we will show the ability to harvest and process several terabytes of raw data in real time; 3) a detailed discussion of more than 20 sources of data, which include text, sensor, and imagery data; 4) PlanetSense's end-to-end distributed architecture, with a focus on collecting and processing high volumes of streaming data in a Geo-Data Cloud, data fusion methods and algorithms for integrating disparate data sources with existing legacy products, and data analytics and machine learning methods for generating operational intelligence on the fly; 5) the PlanetSense "App" platform, shown with a hands-on application that enables an interested audience to quickly develop and deploy solutions; and 6) several case studies relevant to land use classification, monitoring transient population, high-resolution occupancy analysis, mapping special-event population, uncovering global breaking events and reactions in near-real time, tracking protest, unrest, and other societal turbulence as it happens, and real-time monitoring of infrastructure outages.
DOI: https://doi.org/10.1145/2996913.2996975 (published 2016-10-31)
Citations: 1
MoTrIS: a framework for route planning on multimodal transportation networks
Theodoros Chondrogiannis, J. Gamper, R. Cavaliere, Patrick Ohnewein
In this paper, we present MoTrIS, a service-oriented framework which enables spatio-temporal query processing on multimodal networks that are composed of a road network and one or more schedule-based transportation networks. MoTrIS provides a remote access API, which allows for the development of applications that require the processing of routing queries on multimodal networks. We discuss the architecture of MoTrIS and we elaborate on each of its individual components. The data input module allows for the import of data from various sources into a spatially enabled relational database. The network module builds a multimodal network by combining a road network with one or more transportation networks. The timetable module stores and queries the schedule for each transportation mode. The query processing module enables the execution of queries over the multimodal network. The visualization module exports the results into a format suitable for visualization. Finally, we present a web application which allows users to create, modify and test advanced spatio-temporal services, and we demonstrate all the necessary steps for a user to build such a new service.
DOI: https://doi.org/10.1145/2996913.2997007 (published 2016-10-31)
Citations: 3
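To make the idea of a routing query over a road network combined with a schedule-based network more concrete, here is a small earliest-arrival search. It is a generic time-dependent Dijkstra sketch, not the MoTrIS API or its query processor: the function name, the dictionary layout, and the assumption that vehicles on a line never overtake each other (so boarding the first departure is optimal) are all illustrative.

```python
import heapq
from bisect import bisect_left


def earliest_arrival(road_edges, timetable, source, target, depart_time):
    """Earliest-arrival search over a combined road + schedule-based network.

    road_edges: {node: [(neighbour, travel_seconds), ...]}   always traversable
    timetable:  {node: [(neighbour, [(dep, arr), ...]), ...]} trips sorted by dep
    Times are seconds since midnight. Returns the arrival time, or None if
    the target cannot be reached.
    """
    best = {source: depart_time}
    queue = [(depart_time, source)]
    while queue:
        now, node = heapq.heappop(queue)
        if node == target:
            return now
        if now > best.get(node, float("inf")):
            continue  # stale queue entry
        # Road edges can be taken immediately.
        candidates = [(nbr, now + cost) for nbr, cost in road_edges.get(node, [])]
        # Scheduled edges: board the first departure at or after the current time
        # (assumes no overtaking between trips on the same connection).
        for nbr, trips in timetable.get(node, []):
            i = bisect_left(trips, (now, -1))
            if i < len(trips):
                candidates.append((nbr, trips[i][1]))
        for nbr, arrival in candidates:
            if arrival < best.get(nbr, float("inf")):
                best[nbr] = arrival
                heapq.heappush(queue, (arrival, nbr))
    return None
```

Keeping the timetable separate from the road graph mirrors the split between a network module and a timetable module described in the abstract, although the real system stores both in a relational database.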
Data depth based clustering analysis
Myeong-Hun Jeong, Yaping Cai, C. Sullivan, Shaowen Wang
This paper proposes a new algorithm for identifying patterns within data, based on data depth. Such a clustering analysis has enormous potential to discover previously unknown insights from existing data sets. Many clustering algorithms already exist for this purpose. However, most of them are not affine invariant, so they must operate with different parameters after the data sets are rotated, scaled, or translated. Further, most clustering algorithms based on Euclidean distance can be sensitive to noise because they have no global perspective. Parameter selection also significantly affects the clustering results of each algorithm. Unlike many existing clustering algorithms, the proposed algorithm, called data depth based clustering analysis (DBCA), is able to detect coherent clusters after the data sets are affine transformed without changing a parameter. It is also robust to noise because data depth measures the centrality and outlyingness of the underlying data. Further, it generates relatively stable clusters as the parameter varies. An experimental comparison with leading state-of-the-art alternatives demonstrates that the proposed algorithm outperforms DBSCAN and HDBSCAN in terms of affine invariance, and exceeds or matches their robustness to noise. The robustness to parameter selection is also demonstrated through a case study of clustering Twitter data.
DOI: https://doi.org/10.1145/2996913.2996984 (published 2016-10-31)
Citations: 19
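The claim that a depth-based measure is unchanged by affine transformations can be checked on a toy example. The sketch below uses the Mahalanobis depth, 1 / (1 + d^2), which is one well-known affine-invariant depth function; the abstract does not say which depth function DBCA uses, so this choice, and the helper names, are assumptions for illustration.

```python
def mahalanobis_depth(points):
    """Return the Mahalanobis depth 1 / (1 + d^2) of each 2-D point.

    d^2 is the squared Mahalanobis distance to the sample mean, so central
    points score near 1 and outlying points score near 0.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    depths = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) @ inverse(cov) @ (dx, dy)^T, written out for the 2x2 case.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        depths.append(1.0 / (1.0 + d2))
    return depths


def affine(points, a, b, c, d, tx, ty):
    """Apply x' = a*x + b*y + tx, y' = c*x + d*y + ty to every point."""
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


pts = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (5, 4)]
before = mahalanobis_depth(pts)
after = mahalanobis_depth(affine(pts, 2, 1, -1, 3, 10, -7))  # invertible map
# The depths agree up to floating-point error, and the far point (5, 4)
# has the smallest depth in both cases.
```

A distance threshold such as the `eps` of DBSCAN, by contrast, would have to be retuned after the same transformation, which is the contrast the abstract draws.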
A spatio-temporal, Gaussian process regression, real-estate price predictor
Henry Crosby, Paul Davis, T. Damoulas, S. Jarvis
This paper introduces a novel four-stage methodology for real-estate valuation. The research shows that space, property, economic, neighbourhood and time features all contribute to a house price predictor. In validation, Gaussian Process Regression achieves 96.6% accuracy, beating regression-kriging, random forests and an M5P decision tree. The output is integrated into a commercial real estate decision engine.
DOI: https://doi.org/10.1145/2996913.2996960 (published 2016-10-31)
Citations: 18
Towards interactive analytics and visualization on one billion tweets
Jianfeng Jia, Chen Li, Xi Zhang, Chen Li, M. Carey, Simon Su
We present a system called "Cloudberry" that allows users to interactively query, analyze, and visualize large amounts of data with temporal, spatial, and textual dimensions. As a general-purpose full-stack solution, it has a friendly UI, intelligent middleware, and a powerful big data management backend running Apache AsterixDB to enable big data analytics and visualization. We will demonstrate the system using Twitter data on a computer cluster.
DOI: https://doi.org/10.1145/2996913.2996923 (published 2016-10-31)
Citations: 22
Differentially private publication of location entropy
Hien To, Kien Nguyen, C. Shahabi
Location entropy (LE) is a popular metric for measuring the popularity of various locations (e.g., points-of-interest). Unlike other metrics computed from only the number of (unique) visits to a location, namely frequency, LE also captures the diversity of the users' visits, and is thus more accurate than other metrics. Current solutions for computing LE require full access to the past visits of users to locations, which poses privacy threats. This paper discusses, for the first time, the problem of perturbing location entropy for a set of locations according to differential privacy. The problem is challenging because removing a single user from the dataset will impact multiple records of the database, i.e., all the visits made by that user to various locations. To this end, we first derive non-trivial, tight bounds for both the local and the global sensitivity of LE, and show that to satisfy ε-differential privacy, a large amount of noise must be introduced, rendering the published results useless. Hence, we propose a thresholding technique to limit the number of users' visits, which significantly reduces the perturbation error but introduces an approximation error. To achieve better utility, we extend the technique by adopting two weaker notions of privacy: smooth sensitivity (slightly weaker) and crowd-blending (strictly weaker). Extensive experiments on synthetic and real-world datasets show that our proposed techniques preserve the original data distribution without compromising location privacy.
DOI: https://doi.org/10.1145/2996913.2996985 (published 2016-10-31)
Citations: 23
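The difference between visit frequency and location entropy is easy to see in a few lines of code. The sketch below uses the usual definition of location entropy as the Shannon entropy of each user's share of the visits to a location. The optional cap on visits per user only imitates the thresholding idea mentioned in the abstract; the parameter name is hypothetical, no noise is added, and the function on its own provides no differential privacy.

```python
import math
from collections import Counter


def location_entropy(visits, max_visits_per_user=None):
    """Shannon entropy (natural log) of the per-user visit shares at one location.

    visits: iterable of user ids, one entry per visit to the location.
    max_visits_per_user: optional cap on how many visits of each user are counted
    (an illustrative stand-in for thresholding, not the paper's algorithm).
    """
    counts = Counter(visits)
    if max_visits_per_user is not None:
        counts = {u: min(c, max_visits_per_user) for u, c in counts.items()}
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Two locations with the same frequency (8 visits) but different diversity.
cafe = ["u1"] * 8                       # one regular customer
plaza = ["u1", "u2", "u3", "u4"] * 2    # four users, two visits each
le_cafe = location_entropy(cafe)        # entropy of a single user's visits is zero
le_plaza = location_entropy(plaza)      # uniform over four users: log(4)
```

Capping each user's visits bounds how much any one person can change the result, which is why a smaller cap allows less noise at the cost of an approximation error.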
Large-scale geolocalization of overhead imagery
Mehul Divecha, S. Newsam
In this paper, we investigate state-of-the-art computer vision techniques to perform large-scale geolocalization of overhead imagery through image matching. We consider two types of features: scale-invariant feature transform (SIFT) features and region-based shape features. Since these features can be high dimensional and an image can contain many of them, using them to perform image matching can be computationally expensive. Therefore, we also investigate two methods for performing efficient matching: aggregating the features at the image level using a bag-of-words framework, and using hashing to perform multiple, efficient matches and then aggregating the results. We show that hashing performs better in terms of accuracy but is computationally more expensive than bag of words. We also show that shape features may be accurate and efficient for small data sets, but they do not scale well to large data sets.
DOI: https://doi.org/10.1145/2996913.2996980 (published 2016-10-31)
Citations: 9
CrimeStand: spatial tracking of criminal activity
Faizan Wajid, H. Samet
Pursuing criminal activity is tied to understanding illegal or unlawful actions taken on opportunity within a geographic location. Mapping such activities can aid significantly in determining the health of a region and the vicissitudes of civilian life. Methods to track crime and criminal activity after the fact, by mapping news reports of it to geographic locations using the NewsStand system, are discussed. NewsStand provides a map-query interface that monitors over 10,000 RSS news sources and makes them available within minutes after publication. NewsStand was designed to collect event data given keywords centered on locations specified textually and to map these locations to their spatial representation, a procedure called geotagging. The goal is to demonstrate how to detect and classify criminal activity by geotagging keywords pertaining to crime and, in effect, to enhance the capabilities of NewsStand to explicitly show this category of news. The resulting system is named "CrimeStand".
DOI: https://doi.org/10.1145/2996913.2997006 (published 2016-10-31)
Citations: 9