Geographic web usage estimation by monitoring DNS caches

Hüseyin Akcan, Torsten Suel, Hervé Brönnimann
International Workshop on Location and the Web, published 2008-04-22
DOI: 10.1145/1367798.1367813 (https://doi.org/10.1145/1367798.1367813)
Citations: 12

Abstract

DNS is one of the most actively used distributed databases in the world, accessed by millions of people every day to transparently convert host names into IP addresses and vice versa. To improve performance, DNS servers keep temporary records of all requested domain names in their caches. While most DNS servers are configured to serve only their local users, many still respond to public queries. Querying these servers reveals which domains were recently visited. By exploiting the geographically distributed nature of DNS, one can gather usage statistics at scales ranging from a single DNS server to the entire globe. In particular, this makes it possible to collect statistics about geographic differences in web browsing behavior between regions of a country or of the world. In this paper, we present methods to identify these public DNS servers, discuss how to crawl them effectively, and describe our algorithm for extracting usage estimates from the crawl data. We evaluate the estimation algorithm using extensive simulations, then use it to crawl DNS servers at 150 U.S. universities for various domains and explore the effects of location and time on the access rates of these domains.
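The abstract does not spell out how a cache is probed or how usage is estimated, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a common cache-probing approach: send a query with the "recursion desired" (RD) bit cleared, so that a server willing to answer can only reply from its cache. It also assumes a toy estimator: if requests for a domain arrive as a Poisson process with rate λ and the record lives in the cache for a fixed TTL T, the cache holds the record for a fraction p = λT / (1 + λT) of the time, which gives λ = p / (T (1 - p)). The function names `build_cache_probe`, `answer_count`, and `estimate_rate` are hypothetical.

```python
import struct


def build_cache_probe(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS A-record query with the RD (recursion desired) bit cleared.

    Assumption: with RD=0 the server answers from its cache only, so a
    non-empty answer section suggests the domain was requested recently.
    """
    flags = 0x0000  # standard query, RD bit = 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def answer_count(response: bytes) -> int:
    """Return ANCOUNT from a DNS response header (first 12 bytes)."""
    if len(response) < 12:
        raise ValueError("truncated DNS header")
    _id, _flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return ancount


def estimate_rate(hit_fraction: float, ttl_seconds: float) -> float:
    """Toy estimator (an assumption, not the paper's algorithm).

    Inverts p = lambda*T / (1 + lambda*T) to get requests per second.
    """
    if not 0.0 <= hit_fraction < 1.0:
        raise ValueError("hit_fraction must be in [0, 1)")
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return hit_fraction / (ttl_seconds * (1.0 - hit_fraction))
```

To probe a real server, the packet could be sent over UDP to port 53 with the standard `socket` module and the reply passed to `answer_count`. The estimator breaks down when every probe hits (p = 1), which is why that case is rejected rather than reported as an infinite rate.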