Integration of linked data sources for gazetteer expansion

T. Moura, C. Davis
Published in: Proceedings of the 8th Workshop on Geographic Information Retrieval, 2014-11-04
DOI: 10.1145/2675354.2675357 (https://doi.org/10.1145/2675354.2675357)
Citation count: 7

Abstract

Determining the geographic scope of documents is important for many applications in geographic information retrieval (GIR). Many techniques require gazetteers as a source of reference data. However, creating and maintaining gazetteers is still a complex and demanding task. We propose using linked data sources to assemble gazetteer data that can be both broad (e.g., planetary) and deep (e.g., down to urban detail). Linked data sources also allow enriching the resulting gazetteer with a set of geographic and semantic relationships involving place names and other geographic and non-geographic terms, thus expanding the possibilities for solving typical GIR problems such as disambiguation and filtering. This work reports the results of combining two linked data sources of gazetteer data, namely GeoNames and DBPedia, to populate an integrated and semantically enriched gazetteer. To decide which places are the same in both sources, we used evidence contained in attributes such as Wikipedia URLs, Linked Data predicates that indicate that places in both sources are the same, and some additional criteria. The resulting gazetteer contains 8,729,833 places, of which 426,317 are found in both data sources. This relatively small overlap is analyzed, indicating that GeoNames and DBPedia are complementary and typically cover different classes of places, which suggests that further expansion can be achieved by integrating gazetteer data from additional Linked Data sources.
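
The abstract names two kinds of matching evidence (shared Wikipedia URLs and Linked Data "same as" predicates) without giving implementation detail. The following is a minimal Python sketch of how such evidence could be combined, not the authors' actual procedure. The record layout (`id`, `uri`, `wikipedia`, `same_as`), the function names, and the sample records with their identifiers are all hypothetical and chosen only for illustration; the "additional criteria" mentioned in the abstract are not modeled.

```python
from urllib.parse import unquote, urlparse


def normalize_wikipedia_url(url):
    """Reduce a Wikipedia URL to a comparable (language, title) key, or None."""
    parsed = urlparse(url.strip())
    host = parsed.netloc.lower()
    if not host.endswith("wikipedia.org"):
        return None
    lang = host.split(".")[0]
    # Decode percent-escapes so differently encoded URLs compare equal.
    title = unquote(parsed.path.rsplit("/", 1)[-1]).replace(" ", "_")
    return (lang, title) if title else None


def match_places(geonames, dbpedia):
    """Return (geonames_id, dbpedia_uri) pairs supported by either kind of evidence."""
    matches = set()

    # Evidence 1: explicit "same as" links from DBpedia records to GeoNames URIs.
    by_gn_uri = {g["uri"].rstrip("/"): g["id"] for g in geonames}
    for d in dbpedia:
        for target in d.get("same_as", []):
            gid = by_gn_uri.get(target.rstrip("/"))
            if gid is not None:
                matches.add((gid, d["uri"]))

    # Evidence 2: both records point at the same Wikipedia article.
    by_wiki = {}
    for d in dbpedia:
        key = normalize_wikipedia_url(d.get("wikipedia", ""))
        if key:
            by_wiki[key] = d["uri"]
    for g in geonames:
        for url in g.get("wikipedia", []):
            key = normalize_wikipedia_url(url)
            if key in by_wiki:
                matches.add((g["id"], by_wiki[key]))
    return matches


# Hypothetical sample records; the ids and links are made up for illustration.
geonames = [
    {"id": 101, "uri": "http://sws.geonames.org/101/",
     "wikipedia": ["http://en.wikipedia.org/wiki/Belo_Horizonte"]},
    {"id": 102, "uri": "http://sws.geonames.org/102/", "wikipedia": []},
    {"id": 103, "uri": "http://sws.geonames.org/103/", "wikipedia": []},
]
dbpedia = [
    {"uri": "http://dbpedia.org/resource/Belo_Horizonte",
     "wikipedia": "https://en.wikipedia.org/wiki/Belo_Horizonte", "same_as": []},
    {"uri": "http://dbpedia.org/resource/Ouro_Preto",
     "wikipedia": "", "same_as": ["http://sws.geonames.org/102"]},
]

matches = match_places(geonames, dbpedia)

# Share of the merged gazetteer found in both sources, from the reported counts.
overlap_share = 426_317 / 8_729_833
print(sorted(matches))
print(f"overlap share: {overlap_share:.2%}")
```

In this sketch, place 103 stays unmatched and would enter a merged gazetteer as a GeoNames-only entry, which mirrors the situation the abstract describes: with the reported counts, fewer than 5% of the places in the merged gazetteer appear in both sources.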