Uncovering the spatial relatedness in Wikipedia
Authors: Gianluca Quercini, H. Samet
Venue: Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Publication date: 2014-11-04
DOI: 10.1145/2666310.2666398 (https://doi.org/10.1145/2666310.2666398)
Citations: 16

Abstract

In previous work we showed that knowing the spatial reader scope of a news source, that is, the geographic location for which its content is primarily produced, plays an important role in disambiguating toponyms in news articles. Determining the spatial reader scope of a news source relies on the notion of a local lexicon, which for a location l is defined as a set of concepts, such as names of people, landmarks, and historical events, that are spatially related to l. Automatically determining a local lexicon for a wide range of locations is key to implementing an efficient geotagged news retrieval system, such as NewsStand and its variants TwitterStand and PhotoStand. The major research challenge is measuring the spatial relatedness of a concept to a location. Our previous work used a similarity measure based on the geographic coordinates attached to Wikipedia articles to find concepts that are spatially related to a given location. This yields local lexicons that mostly contain spatial concepts, even though non-spatial concepts, such as people or food specialties, are key elements of the identity of a location. In this paper, we explore a set of graph-based similarity measures that determine the local lexicon of a location from Wikipedia without using any spatial clues, based on the observation that the spatial relatedness of a concept to a location is hidden in the Wikipedia link structure. Our evaluation on the local lexicons of 1,200 locations indicates that this observation is well-founded. Additionally, experiments on standard datasets show that SynRank, one of the measures we propose for computing the spatial relatedness of a concept to a location, rivals existing similarity measures in determining the semantic relatedness between Wikipedia articles.
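To make the idea of reading spatial relatedness off the link structure concrete, the sketch below ranks concepts for a location on a tiny, made-up link graph. It is not SynRank, which the abstract does not define; it swaps in personalized PageRank (a random walk with restart) as one generic graph-based relatedness measure. The article names, the links between them, and the function names `personalized_pagerank` and `local_lexicon` are all assumptions chosen for illustration.

```python
def personalized_pagerank(links, source, damping=0.85, iterations=50):
    """Random walk with restart on a directed link graph.

    `links` maps an article to the articles it links to. The walk
    restarts at `source`, so scores measure closeness to `source`.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    scores = {n: 0.0 for n in nodes}
    scores[source] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for node, score in scores.items():
            targets = links.get(node, [])
            if targets:
                # Spread the damped score evenly over outgoing links.
                share = damping * score / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling article: send its damped mass back to the source.
                nxt[source] += damping * score
        # Restart mass; total score stays equal to 1.
        nxt[source] += 1.0 - damping
        scores = nxt
    return scores


def local_lexicon(links, location, k=3):
    """Top-k articles most related to `location` (toy stand-in for a local lexicon)."""
    scores = personalized_pagerank(links, location)
    ranked = sorted(
        (n for n in scores if n != location and scores[n] > 0.0),
        key=lambda n: scores[n],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical link graph; no coordinates are used anywhere.
links = {
    "Boston": ["Fenway Park", "Boston Tea Party", "Clam chowder"],
    "Fenway Park": ["Boston", "Baseball"],
    "Boston Tea Party": ["Boston", "Tea"],
    "Clam chowder": ["Boston", "Soup"],
    "Baseball": ["Fenway Park"],
    "Tea": [],
    "Soup": [],
    "Eiffel Tower": ["Paris"],
    "Paris": ["Eiffel Tower"],
}

print(local_lexicon(links, "Boston"))
```

In this toy graph the non-spatial concepts (an event, a dish) surface next to the landmark purely because of how the articles link to each other, while articles in a disconnected part of the graph receive no score at all.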