Uncovering the spatial relatedness in Wikipedia
Gianluca Quercini, H. Samet
Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2014-11-04
DOI: 10.1145/2666310.2666398 (https://doi.org/10.1145/2666310.2666398)

In previous work we showed that knowing the spatial reader scope of a news source, that is, the geographical location for which its content is primarily produced, plays an important role in disambiguating toponyms in news articles. The spatial reader scope of a news source is determined from a local lexicon, which for a location l is defined as a set of concepts, such as names of people, landmarks and historical events, that are spatially related to l. Automatically determining a local lexicon for a wide range of locations is key to implementing an efficient geotagged news retrieval system, such as NewsStand and its variants TwitterStand and PhotoStand. The major research challenge is measuring the spatial relatedness of a concept to a location. Our previous work relied on a similarity measure that used the geographic coordinates attached to Wikipedia articles to find concepts that are spatially related to a given location. This yields local lexicons that mostly contain spatial concepts, even though non-spatial concepts, such as people or food specialties, are key elements of the identity of a location. In this paper, we explore a set of graph-based similarity measures that determine the local lexicon of a location from Wikipedia without using any spatial clues, based on the observation that the spatial relatedness of a concept to a location is hidden in the Wikipedia link structure. Our evaluation on the local lexicons of 1,200 locations indicates that this observation is well-founded. Additionally, we provide experiments on standard datasets showing that SynRank, one of the measures that we propose for computing the spatial relatedness of a concept to a location, rivals existing similarity measures in determining the semantic relatedness between Wikipedia articles.
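
To make the idea of reading spatial relatedness off the link structure concrete, the following is a minimal sketch and not the measures proposed in the paper. The abstract does not define SynRank, so the sketch swaps in a plain Jaccard similarity over link neighborhoods as a stand-in. The toy link graph, the function names (`neighbors`, `jaccard`, `local_lexicon`) and the cutoff `k` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy link graph: article -> set of articles it links to.
# Real input would be the Wikipedia link structure; no coordinates are used.
links = {
    "Baltimore": {"Fort McHenry", "Edgar Allan Poe", "Crab cake", "Maryland"},
    "Fort McHenry": {"Baltimore", "Maryland"},
    "Edgar Allan Poe": {"Baltimore", "Boston"},
    "Crab cake": {"Baltimore", "Maryland"},
    "Boston": {"Massachusetts", "Edgar Allan Poe"},
    "Eiffel Tower": {"Paris"},
}


def neighbors(link_graph):
    """Build undirected neighbor sets from outgoing links."""
    nbrs = defaultdict(set)
    for src, targets in link_graph.items():
        for dst in targets:
            nbrs[src].add(dst)
            nbrs[dst].add(src)
    return dict(nbrs)


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def local_lexicon(link_graph, location, k=3):
    """Rank concepts by neighborhood overlap with `location`; keep the top k.

    Stand-in measure (assumption): Jaccard similarity of closed link
    neighborhoods, i.e. each article's neighbors plus the article itself.
    """
    nbrs = neighbors(link_graph)
    loc_hood = nbrs.get(location, set()) | {location}
    scores = {
        concept: jaccard(loc_hood, hood | {concept})
        for concept, hood in nbrs.items()
        if concept != location
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


lexicon = local_lexicon(links, "Baltimore", k=3)
```

On this toy graph a non-spatial concept such as "Crab cake" can enter the lexicon of "Baltimore" purely through shared links, which is the kind of concept a coordinate-based measure would tend to miss. A realistic setting would need a measure that also accounts for indirect paths and for hub articles that link to almost everything.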