Proceedings of the Third International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data: Latest Papers

An empirical study of workers' behavior in spatial crowdsourcing
Hien To, Rúben Geraldes, C. Shahabi, S. H. Kim, H. Prendinger
With the ubiquity of smartphones, spatial crowdsourcing (SC) has emerged as a new paradigm that engages mobile users to perform tasks in the physical world. Thus, various SC techniques have been studied for performance optimization. However, little research has been done to understand workers' behavior in the real world. In this study, we designed and performed two real-world SC campaigns utilizing our mobile app, called Genkii, which is a GPS-enabled app for users to report their affective state (e.g., happy, sad). We used Yahoo! Japan Crowdsourcing as the payment platform to reward users for reporting their affective states at different locations and times. We studied the relationship between incentives and participation by analyzing the impact of offering a fixed reward versus an increasing reward scheme. We observed that users tend to stay in a campaign longer when the provided incentives gradually increase over time. We also found that the degree of mobility is correlated with the reported information. For example, users who travel more are observed to be happier than those who travel less. Furthermore, analyzing the spatiotemporal information of the reports reveals interesting mobility patterns that are unique to spatial crowdsourcing.
DOI: https://doi.org/10.1145/2948649.2948657 (published 2016-06-26)
Citations: 7
Taming twisted cubes
P. Baumann, E. Hirschorn, J. Masó, A. Dumitru, Vlad Merticariu
Spatio-temporal grid data form a core structure in Earth and Space sciences alike. While Array Databases have set out to support this information category, they only offer integer indexing, which corresponds to equidistant grids. In reality, however, grids often have irregular structures, such as raw satellite swath data. We present an approach to modeling spatio-temporal regular and non-regular grids in a coherent manner, suitable for querying, transporting, and storing such data while remaining format independent. We briefly describe an implementation based on the combination of a relational and an array DBMS. Our model is currently under adoption as an international standard by OGC and ISO.
DOI: https://doi.org/10.1145/2948649.2948650 (published 2016-06-26)
Citations: 4
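The contrast that the "Taming twisted cubes" abstract draws between integer indexing and irregular grids can be illustrated with a small sketch. It is not the authors' model and not an OGC or ISO interface: the class `IrregularAxis` and its methods are hypothetical names chosen for illustration, and the sketch assumes a single axis whose positions are stored as an explicit, strictly increasing coordinate list.

```python
from bisect import bisect_left


class IrregularAxis:
    """Hypothetical axis mapping integer cell indexes to explicit coordinates."""

    def __init__(self, coords):
        # Coordinates must be strictly increasing so binary search is valid.
        if any(b <= a for a, b in zip(coords, coords[1:])):
            raise ValueError("coordinates must be strictly increasing")
        self.coords = list(coords)

    def coord_of(self, index):
        """Integer index (what an array store addresses) -> coordinate."""
        return self.coords[index]

    def nearest_index(self, value):
        """Coordinate -> index of the closest stored grid position."""
        pos = bisect_left(self.coords, value)
        if pos == 0:
            return 0
        if pos == len(self.coords):
            return len(self.coords) - 1
        before, after = self.coords[pos - 1], self.coords[pos]
        return pos if after - value < value - before else pos - 1


# Non-equidistant positions, e.g. unevenly spaced acquisition times.
axis = IrregularAxis([0.0, 1.0, 2.5, 7.0])
print(axis.coord_of(2), axis.nearest_index(6.0))
```

On an equidistant grid the coordinate could be computed as `origin + index * step`; the lookup table is only needed because the spacing varies.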
Prediction of user app usage behavior from geo-spatial data
Xiao-Xing Zhao, Yuanyuan Qiao, Zhongwei Si, Jie Yang, Anders Lindgren
In the era of mobile Internet, a vast amount of geo-spatial data allows us to gain further insights into human activities, which is critical for Internet Service Providers (ISPs) to provide better personalized services. With the pervasiveness of mobile Internet, much evidence shows that human mobility has a heavy impact on app usage behavior. In this paper, we propose a method based on machine learning to predict users' app usage behavior using several features of human mobility extracted from geo-spatial data in mobile Internet traces. The core idea of our method is to select a set of mobility attributes (e.g., location, travel pattern, and mobility indicators) that have a large impact on app usage behavior and to input them into a classification model. We evaluate our method using real-world network traffic collected by our self-developed high-speed Traffic Monitoring System (TMS). Our prediction method achieves 90.3% accuracy in our experiment, which verifies the strong correlation between human mobility and app usage behavior. Our experimental results reveal the considerable potential of geo-spatial data extracted from mobile Internet.
DOI: https://doi.org/10.1145/2948649.2948656 (published 2016-06-26)
Citations: 12
Geodata supported classification of patent applications
J. Stutzki, Matthias Schubert
The automatic classification of patent applications into a particular patent classification system remains a challenge with many practical applications. From a computer science point of view, the task is a multi-label hierarchical classification problem, i.e., each patent application might belong to multiple classes within the class hierarchy. The problem is still especially difficult for purely text-based classifiers because patents and patent applications are often formulated in a rather generic way. Thus, additional sources of information should be used to improve class prediction. In our approach, we propose the use of location information contained in the metadata of a patent application in combination with text-based patent classification. We argue that certain technological areas often cluster in geographic regions. For example, space travel technology is often located in Houston, Texas, due to the NASA facilities in this area. In many cases, the addresses of the inventors are correlated with the technological area of a given patent. Thus, the addresses can be exploited to provide additional information about the technological area. We present a geo-enriched classifier joining established methods for text-based classification with location-based topic prediction. Since the location-based prediction is not applicable to all cases, we provide a method to regulate the impact of the spatial predictor for these cases. Our experiments indicate that spatial prediction is applicable to a considerable number of patent applications and that the combination of spatial prediction and text-based classification significantly improves the prediction accuracy.
DOI: https://doi.org/10.1145/2948649.2948653 (published 2016-06-26)
Citations: 8
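The idea in the patent classification abstract of joining a text-based classifier with a location-based predictor, while regulating the spatial predictor's impact, can be sketched as a weighted blend of per-class scores. The abstract does not specify the combination rule, so the linear blend, the `spatial_weight` parameter, the threshold, and the function names below are all assumptions for illustration. Class labels in the example are placeholders.

```python
def combine_scores(text_scores, spatial_scores, spatial_weight):
    """Blend two per-class score dicts; spatial_weight must lie in [0, 1].

    A weight of 0 ignores the spatial predictor entirely, which is one
    simple way to handle applications where location is uninformative.
    """
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must be between 0 and 1")
    classes = set(text_scores) | set(spatial_scores)
    return {
        c: (1.0 - spatial_weight) * text_scores.get(c, 0.0)
        + spatial_weight * spatial_scores.get(c, 0.0)
        for c in classes
    }


def predict_labels(scores, threshold):
    """Multi-label decision: keep every class whose score reaches the threshold."""
    return sorted(c for c, s in scores.items() if s >= threshold)


text = {"A": 0.9, "B": 0.3}      # hypothetical text-classifier scores
spatial = {"A": 0.1, "B": 0.9}   # hypothetical location-based scores
blended = combine_scores(text, spatial, spatial_weight=0.5)
print(predict_labels(blended, threshold=0.55))
```

Scores are treated independently per class rather than normalized to sum to one, because in a multi-label setting several classes may apply to the same application.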
Geo-fingerprinting social media content
Hatim Gazaz, A. Croitoru, P. Delamater, D. Pfoser
With the percentage of Twitter users approaching 20% of the US population by 2019, tweets provide a good sample of the public's sentiment and opinion. Consequently, such data has been used extensively in commercial and research efforts. While existing work has analyzed the content of tweets in relation to the underlying social network of a discussion, somewhat less attention has been paid to the spatial distribution of messages and topics. This work tries to assess the locality of discussions using the concepts mentioned in tweets. Based on a global distribution of topics across the 48 contiguous states, we try to ascertain spatial topic dissimilarity by recursively subdividing the space into smaller and smaller partitions and using statistical testing to compare the distributions. Experimenting with a large Twitter dataset for the US, we observe that locality of a discussion occurs at specific thresholds and that only 14 of the 49 most populous urban areas feature a unique discussion. Overall, this work establishes trends as to when locality in a social media discussion occurs.
DOI: https://doi.org/10.1145/2948649.2948654 (published 2016-06-26)
Citations: 1
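The recursive subdivision with statistical testing described in the geo-fingerprinting abstract can be sketched as a quadtree walk that compares each cell's topic counts with a global topic distribution. The abstract does not name the test or the stopping rules, so the chi-square goodness-of-fit statistic, the `critical`, `min_points`, and `max_depth` parameters, and the function names are assumptions for illustration only.

```python
from collections import Counter


def chi_square(observed, expected_share):
    """Goodness-of-fit statistic of topic counts against expected shares."""
    n = sum(observed.values())
    if n == 0:
        return 0.0
    return sum(
        (observed.get(topic, 0) - n * share) ** 2 / (n * share)
        for topic, share in expected_share.items()
        if share > 0
    )


def find_local_cells(points, bounds, global_share, critical,
                     min_points=20, max_depth=4, depth=0):
    """Return (bounds, depth) for cells whose topic mix differs from the global one.

    points: list of (x, y, topic); bounds: (xmin, ymin, xmax, ymax).
    """
    xmin, ymin, xmax, ymax = bounds
    inside = [p for p in points if xmin <= p[0] < xmax and ymin <= p[1] < ymax]
    if len(inside) < min_points:
        return []  # too few messages to test reliably
    counts = Counter(topic for _, _, topic in inside)
    if chi_square(counts, global_share) > critical:
        return [(bounds, depth)]  # this cell already shows a distinct discussion
    if depth == max_depth:
        return []
    xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrants = [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
                 (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]
    found = []
    for quadrant in quadrants:
        found += find_local_cells(inside, quadrant, global_share, critical,
                                  min_points, max_depth, depth + 1)
    return found


# Toy data: one corner talks only about "a", the opposite corner only about "b".
points = [(0.5, 0.5, "a")] * 40 + [(3.0, 3.0, "b")] * 40
share = {"a": 0.5, "b": 0.5}
# 3.84 is the 5% critical value of chi-square with one degree of freedom.
cells = find_local_cells(points, (0, 0, 4, 4), share, critical=3.84)
print(cells)
```

In the toy data the whole area matches the global distribution, so the walk only reports cells after the first split. The critical value has to be chosen to match the number of topics, which this sketch leaves to the caller.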
GeoSocialBound: an efficient framework for estimating social POI boundaries using spatio-textual information
Dung D. Vu, Hien To, Won-Yong Shin, C. Shahabi
In this paper, we present a novel framework for estimating social point-of-interest (POI) boundaries, also termed GeoSocialBound, utilizing spatio-textual information based on geo-tagged tweets. We start by defining a social POI boundary as one small-scale cluster containing its POI center, geographically formed with a convex polygon. Motivated by an insightful observation with regard to estimation accuracy, we formulate a constrained optimization problem, in which we are interested in finding the radius of a circle such that a newly defined objective function is maximized. To solve this problem, we introduce an efficient optimal estimation algorithm whose runtime complexity is linear in the number of geo-tags in a dataset. In addition, we empirically evaluate the estimation performance of our GeoSocialBound algorithm for various environments and validate the complexity analysis. As a result, vital information on how to obtain real-world GeoSocialBounds with a high degree of accuracy is provided.
DOI: https://doi.org/10.1145/2948649.2948652 (published 2016-06-26)
Citations: 20
So far away and yet so close: augmenting toponym disambiguation and similarity with text-based networks
Andreas Spitz, Johanna Geiß, Michael Gertz
Place similarity has a central role in geographic information retrieval and geographic information systems, where spatial proximity is frequently just a poor substitute for semantic relatedness. For applications such as toponym disambiguation, alternative measures are thus required to answer the non-trivial question of place similarity in a given context. In this paper, we discuss a novel approach to the construction of a network of locations from unstructured text data. By deriving similarity scores based on the textual distance of toponyms, we obtain a kind of relatedness that encodes the importance of the co-occurrences of place mentions. Based on the text of the English Wikipedia, we construct and provide such a network of place similarities, including entity linking to Wikidata as an augmentation of the contained information. In an analysis of centrality, we explore the network's capability of capturing the similarity between places. An evaluation of the network for the task of toponym disambiguation on the AIDA CoNLL-YAGO dataset reveals a performance that is in line with state-of-the-art methods.
DOI: https://doi.org/10.1145/2948649.2948651 (published 2016-06-26)
Citations: 21
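The construction described in the toponym abstract, a network of places whose edge weights derive from the textual distance between mentions, can be sketched as follows. The abstract does not give the weighting formula, so the exponential decay over the sentence gap, the `max_gap` cutoff, and the function name `build_network` are assumptions for illustration. The sketch also assumes toponyms have already been extracted per sentence.

```python
import math
from collections import defaultdict


def build_network(sentences, max_gap=2):
    """Accumulate edge weights between toponyms that occur near each other.

    sentences: list of lists of toponyms, in document order.
    Returns a dict mapping frozenset({place1, place2}) to a weight.
    """
    weights = defaultdict(float)
    # Flatten to (sentence index, toponym) pairs; indexes are non-decreasing.
    mentions = [(i, t) for i, sent in enumerate(sentences) for t in sent]
    for a in range(len(mentions)):
        for b in range(a + 1, len(mentions)):
            (i, first), (j, second) = mentions[a], mentions[b]
            gap = j - i
            if gap > max_gap:
                break  # later mentions are even further away
            if first != second:
                # Closer co-occurrences contribute more to the edge weight.
                weights[frozenset((first, second))] += math.exp(-gap)
    return dict(weights)


doc = [["Paris", "Lyon"], ["Berlin"]]
network = build_network(doc)
print(network[frozenset(("Paris", "Lyon"))])
```

Summing the contributions over a large corpus would give frequently and closely co-mentioned places the heaviest edges, independent of their geographic distance.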
Dynamically ranked top-k spatial keyword search
S. Ray, B. Nickerson
With the growing data volume and popularity of Web services and Location-Based Services (LBS), new spatio-textual applications are emerging. These applications are contributing to a deluge of geo-tagged documents. As a result, top-k spatial keyword searches have attracted a lot of attention and a number of spatio-textual indexes have been proposed. However, these indexes do not consider the "recency" of the indexed documents. Part of the challenge is due to the fact that the textual relevance score measures that these indexes use require all documents to be inspected. To address these issues, we propose the idea of "dynamic ranking" of spatio-textual objects. We also introduce a novel index, called STARI, which uses this ranking method to retrieve the most recent top-k relevant objects. Experimental evaluation demonstrates that our system can support high document update rates and low query latency.
DOI: https://doi.org/10.1145/2948649.2948655 (published 2016-06-26)
Citations: 9
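The notion of ranking spatio-textual objects by recency as well as relevance, as raised in the top-k spatial keyword search abstract, can be illustrated with a brute-force baseline. This is not STARI and not the authors' ranking function: the scoring formula, the half-life decay, the `max_dist` cutoff, and the document layout are assumptions for illustration. Note that the sketch inspects every document, which is exactly the cost the abstract says an index should avoid.

```python
import heapq
import math


def dynamic_score(doc, query_terms, query_xy, now, half_life=3600.0, max_dist=10.0):
    """Hypothetical score: (text overlap + spatial proximity) decayed by age."""
    overlap = len(query_terms & doc["terms"]) / (len(query_terms) or 1)
    distance = math.dist(query_xy, doc["xy"])
    proximity = max(0.0, 1.0 - distance / max_dist)
    # The score halves every `half_life` seconds since the document appeared.
    recency = 0.5 ** ((now - doc["time"]) / half_life)
    return (overlap + proximity) * recency


def top_k(docs, query_terms, query_xy, now, k):
    """Return the ids of the k highest-scoring documents (full scan)."""
    best = heapq.nlargest(
        k, docs, key=lambda d: dynamic_score(d, query_terms, query_xy, now)
    )
    return [d["id"] for d in best]


docs = [
    {"id": "old", "terms": {"cafe"}, "xy": (0.0, 0.0), "time": 0.0},
    {"id": "new", "terms": {"cafe"}, "xy": (0.0, 0.0), "time": 3600.0},
]
print(top_k(docs, {"cafe"}, (0.0, 0.0), now=3600.0, k=1))
```

Because the recency factor depends on the query time, the ranking of the same two documents changes as time passes, which is why precomputed static scores are not sufficient here.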
Proceedings of the Third International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data
DOI: https://doi.org/10.1145/2948649
Citations: 0