This paper examines 1) the scope of geo-ontologies used for information retrieval on the Web, 2) the core geographical concepts and their mutual relations, and 3) the properties of those concepts. Furthermore, we present the Finnish geo-ontology (Suomalainen paikkaontologia, SUO) and discuss the theories and principles that have governed its development, as well as the limitations and requirements that the use of geographical dictionaries as an instance data source has imposed on the content and structure of SUO.
{"title":"Core geographical concepts: case Finnish geo-ontology","authors":"R. Henriksson, Tomi Kauppinen, E. Hyvönen","doi":"10.1145/1367798.1367807","DOIUrl":"https://doi.org/10.1145/1367798.1367807","url":null,"PeriodicalId":320466,"journal":{"name":"International Workshop on Location and the Web","volume":"192 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126032151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DNS is one of the most actively used distributed databases on earth, accessed by millions of people every day to transparently convert host names into IP addresses and vice versa. To improve performance, DNS servers also keep temporary records of all requested domain names in their cache. While most DNS servers are configured to be used by their local users only, many DNS servers still respond to public queries. Querying these DNS servers reveals the recently visited domains. Exploiting the geographically distributed nature of DNS, one can gather usage statistics ranging from a single DNS server to a global scale. In particular, this enables collecting statistics about geographic differences in web browsing behavior between different regions of a country or the world. In this paper, we present methods to identify these public DNS servers, discuss how to crawl them effectively, and describe our algorithm to extract usage estimates from the crawl data. We also evaluate our estimation algorithm using extensive simulations, and finally use our algorithms to crawl 150 U.S. universities for various domains and explore the effects of location and time on the access rate of these domains.
{"title":"Geographic web usage estimation by monitoring DNS caches","authors":"Hüseyin Akcan, Torsten Suel, Hervé Brönnimann","doi":"10.1145/1367798.1367813","DOIUrl":"https://doi.org/10.1145/1367798.1367813","url":null,"PeriodicalId":320466,"journal":{"name":"International Workshop on Location and the Web","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125001431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
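The cache-probing idea in the DNS abstract above can be illustrated with a small offline simulation. This is only a sketch under stated assumptions, not the authors' estimation algorithm, which the abstract does not spell out: the class `SimulatedDnsCache`, the function `hit_fraction`, the Poisson model of user requests, and the domain name `example.edu` are all hypothetical, and no real DNS traffic is generated.

```python
import random


class SimulatedDnsCache:
    """Toy stand-in for a caching DNS server (hypothetical, no network I/O)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.expiry = {}  # domain -> time at which the cached record expires

    def user_lookup(self, domain, now):
        # A local user's request caches the record unless it is still cached.
        if self.expiry.get(domain, -1.0) <= now:
            self.expiry[domain] = now + self.ttl

    def probe(self, domain, now):
        # Outside probe: reports whether the record is cached, inserts nothing.
        return self.expiry.get(domain, -1.0) > now


def hit_fraction(rate, ttl, horizon, probe_every, seed=0):
    """Fraction of probes that find the domain cached.

    Assumption: user requests arrive as a Poisson process with the given rate.
    """
    rng = random.Random(seed)
    cache = SimulatedDnsCache(ttl)

    # Draw the user request times up to the horizon.
    arrivals = []
    t = rng.expovariate(rate)
    while t < horizon:
        arrivals.append(t)
        t += rng.expovariate(rate)

    hits = probes = 0
    i = 0
    now = probe_every
    while now < horizon:
        # Replay every user request that happened before this probe.
        while i < len(arrivals) and arrivals[i] <= now:
            cache.user_lookup("example.edu", arrivals[i])
            i += 1
        hits += cache.probe("example.edu", now)
        probes += 1
        now += probe_every
    return hits / probes


if __name__ == "__main__":
    popular = hit_fraction(rate=1.0, ttl=10.0, horizon=10_000.0, probe_every=7.0)
    rare = hit_fraction(rate=0.01, ttl=10.0, horizon=10_000.0, probe_every=7.0)
    print(f"popular domain hit fraction: {popular:.2f}")
    print(f"rare domain hit fraction:    {rare:.2f}")
```

In this toy model, a frequently requested domain is found in the cache far more often than a rarely requested one, which is the kind of signal a crawler of public DNS servers could turn into a usage estimate.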
The number of search queries that are associated with geographical locations, either explicitly or implicitly, has quadrupled in recent years. For such geo-sensitive queries, the ability to accurately infer users' geographical preferences greatly enhances their search experience. By mining past user clicks and constructing a geographical click probability distribution model, we address two important issues in spatial Web search: how to determine whether a search query is geo-sensitive, and how to detect, disambiguate, and visualize the associated geographical location(s). We present an empirical study on a large-scale dataset with about 9,000 unique queries randomly drawn from the logs of a popular commercial search engine, Yahoo! Search, and about 430 million user clicks on 1.6M unique Web pages over an eight-month period. Our classification method achieved a recall of 0.98 and a precision of 0.75 in identifying geo-sensitive search queries. We also present preliminary findings on using geographical click probability distributions to cluster search results for queries with geographical ambiguities.
{"title":"Modeling and visualizing geo-sensitive queries based on user clicks","authors":"Ziming Zhuang, Clifford Brunk, C. Lee Giles","doi":"10.1145/1367798.1367811","DOIUrl":"https://doi.org/10.1145/1367798.1367811","url":null,"PeriodicalId":320466,"journal":{"name":"International Workshop on Location and the Web","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121663335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
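The notion of a geographical click probability distribution in the abstract above can be made concrete with a minimal sketch. The abstract does not describe the authors' model or classifier, so everything here is an assumption for illustration: the function names, the region labels, the use of normalized entropy as a concentration measure, and the 0.5 threshold are all hypothetical.

```python
import math
from collections import Counter


def click_distribution(click_locations):
    """Empirical probability of a click coming from each location."""
    counts = Counter(click_locations)
    total = sum(counts.values())
    return {loc: n / total for loc, n in counts.items()}


def normalized_entropy(dist, num_regions):
    """Shannon entropy of the distribution, scaled to [0, 1] by log(num_regions)."""
    if num_regions < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in dist.values() if p > 0)
    return h / math.log(num_regions)


def looks_geo_sensitive(click_locations, regions, threshold=0.5):
    """Hypothetical heuristic: clicks concentrated in few regions suggest geo-sensitivity."""
    dist = click_distribution(click_locations)
    return normalized_entropy(dist, len(regions)) < threshold


if __name__ == "__main__":
    regions = ["CA", "NY", "TX", "WA"]  # assumed region labels
    concentrated = ["CA"] * 90 + ["NY"] * 5 + ["TX"] * 5
    spread_out = ["CA"] * 25 + ["NY"] * 25 + ["TX"] * 25 + ["WA"] * 25
    print(looks_geo_sensitive(concentrated, regions))
    print(looks_geo_sensitive(spread_out, regions))
```

Entropy is used here only because it is a simple way to summarize how concentrated a distribution is; a query whose clicks are spread evenly over all regions scores close to 1, while one dominated by a single region scores close to 0.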