Cache Me If You Can: Effects of DNS Time-to-Live
G. Moura, J. Heidemann, R. Schmidt, W. Hardaker
DOI: https://doi.org/10.1145/3355369.3355568
DNS depends on extensive caching for good performance, and every DNS zone owner must set Time-to-Live (TTL) values to control their DNS caching. Today there is relatively little guidance backed by research about how to set TTLs, and operators must balance conflicting demands of caching against agility of configuration. Exactly how TTL value choices affect operational networks is quite challenging to understand due to interactions across the distributed DNS service: resolvers receive TTLs in different ways (answers and hints), TTLs are specified in multiple places (zones and their parent's glue), and DNS resolution must also be security-aware. This paper provides the first careful evaluation of how these multiple, interacting factors affect the effective cache lifetimes of DNS records. Based on our findings, we provide recommendations for TTL choices in different situations and for where TTLs must be configured. We show that longer TTLs have significant promise in reducing latency, reducing it from 183 ms to 28.7 ms for one country-code TLD.
Profiling BGP Serial Hijackers: Capturing Persistent Misbehavior in the Global Routing Table
Cecilia Testart, P. Richter, Alistair King, A. Dainotti, D. Clark
DOI: https://doi.org/10.1145/3355369.3355581
BGP hijacks remain an acute problem in today's Internet, with widespread consequences. While hijack detection systems are readily available, they typically rely on a priori prefix-ownership information and are reactive in nature. In this work, we take a new perspective on BGP hijacking activity: we introduce and track the long-term routing behavior of serial hijackers, networks that repeatedly hijack address blocks for malicious purposes, often over the course of many months or even years. Based on a ground-truth dataset that we construct by extracting information from network operator mailing lists, we illuminate the dominant routing characteristics of serial hijackers and how they differ from legitimate networks. We then distill features that capture these behavioral differences and train a machine learning model to automatically identify Autonomous Systems (ASes) that exhibit characteristics similar to serial hijackers. Our classifier identifies ≈ 900 ASes with similar behavior in the global IPv4 routing table. We analyze and categorize these networks, finding a wide range of indicators of malicious activity, misconfiguration, as well as benign hijacking activity. Our work presents a solid first step towards identifying and understanding this important category of networks, which can aid network operators in taking proactive measures to defend themselves against prefix hijacking and serve as input for current and future detection systems.
Scanning the Scanners: Sensing the Internet from a Massively Distributed Network Telescope
P. Richter, A. Berger
DOI: https://doi.org/10.1145/3355369.3355595
Scanning of hosts on the Internet to identify vulnerable devices and services is a key component in many of today's cyberattacks. Tracking this scanning activity, in turn, provides an excellent signal to assess the current state of affairs for many vulnerabilities and their exploitation. So far, studies tracking scanning activity have relied on unsolicited traffic captured in darknets, focusing on random scans of the address space. In this work, we track scanning activity through the lens of unsolicited traffic captured at the firewalls of some 89,000 hosts of a major Content Distribution Network (CDN). Our vantage point has two distinguishing features compared to darknets: (i) it is distributed across some 1,300 networks, and (ii) its servers are live, offering services and thus emitting traffic. While all servers receive a baseline level of probing from Internet-wide scans, i.e., scans targeting the entire IPv4 space or random subsets of it, we show that some 30% of all logged scan traffic is the result of localized scans. We find that localized scanning campaigns often target narrow regions in the address space, and that their characteristics in terms of target selection strategy and scanned services differ vastly from the more widely known Internet-wide scans. Our observations imply that conventional darknets can only partially illuminate scanning activity, and may severely underestimate widespread attempts to scan and exploit individual services in specific prefixes or networks. Our methods can be adapted for individual network operators to assess whether they are subjected to targeted scanning activity.
How Cloud Traffic Goes Hiding: A Study of Amazon's Peering Fabric
B. Yeganeh, Ramakrishnan Durairajan, R. Rejaie, W. Willinger
DOI: https://doi.org/10.1145/3355369.3355602
The growing demand for an ever-increasing number of cloud services is profoundly transforming the Internet's interconnection or peering ecosystem, and one example is the emergence of "virtual private interconnections (VPIs)". However, due to the underlying technologies, these VPIs are not publicly visible and traffic traversing them remains largely hidden as it bypasses the public Internet. In particular, existing techniques for inferring Internet interconnections are unable to detect these VPIs and are also incapable of mapping them to the physical facility or geographic region where they are established. In this paper, we present a third-party measurement study aimed at revealing all the peerings between Amazon and the rest of the Internet. We describe our technique for inferring these peering links and pay special attention to inferring the VPIs associated with this largest cloud provider. We also present and evaluate a new method for pinning (i.e., geo-locating) each end of the inferred interconnections or peering links. Our study provides a first look at Amazon's peering fabric. In particular, by grouping Amazon's peerings based on their key features, we illustrate the specific role that each group plays in how Amazon peers with other networks.
Tales from the Porn: A Comprehensive Privacy Analysis of the Web Porn Ecosystem
Pelayo Vallina, Álvaro Feal, Julien Gamba, N. Vallina-Rodriguez, A. Fernández
DOI: https://doi.org/10.1145/3355369.3355583
Modern privacy regulations, including the General Data Protection Regulation (GDPR) in the European Union, aim to control user tracking activities in websites and mobile applications. These privacy rules typically contain specific provisions and strict requirements for websites that provide sensitive material to end users, such as sexual, religious, and health services. However, little is known about the privacy risks that users face when visiting such websites, or about their regulatory compliance. In this paper, we present the first comprehensive and large-scale analysis of 6,843 pornographic websites. We provide an exhaustive behavioral analysis of the use of tracking methods by these websites and of their lack of regulatory compliance, including the absence of age-verification mechanisms and of methods to obtain informed user consent. The results indicate that, as in the regular web, tracking is prevalent across pornographic sites: 72% of the websites use third-party cookies and 5% leverage advanced user fingerprinting technologies. Yet, our analysis reveals a third-party tracking ecosystem semi-decoupled from the regular web, in which various analytics and advertising services track users across, and outside, pornographic websites. We complete the paper with a regulatory compliance analysis in the context of the EU GDPR and newer legal requirements to implement verifiable access control mechanisms (e.g., the UK's Digital Economy Act). We find that only 16% of the analyzed websites have an accessible privacy policy and only 4% provide a cookie consent banner. The use of verifiable access control mechanisms is limited to prominent pornographic websites.
Information Exposure From Consumer IoT Devices: A Multidimensional, Network-Informed Measurement Approach
Jingjing Ren, Daniel J. Dubois, D. Choffnes, A. Mandalari, Roman Kolcun, H. Haddadi
DOI: https://doi.org/10.1145/3355369.3355577
Internet of Things (IoT) devices are increasingly found in everyday homes, providing useful functionality for devices such as TVs, smart speakers, and video doorbells. Along with their benefits come potential privacy risks, since these devices can communicate information about their users to other parties over the Internet. However, understanding these risks in depth and at scale is difficult due to heterogeneity in devices' user interfaces, protocols, and functionality. In this work, we conduct a multidimensional analysis of information exposure from 81 devices located in labs in the US and UK. Through a total of 34,586 rigorous automated and manual controlled experiments, we characterize information exposure in terms of the destinations of Internet traffic, whether the contents of communication are protected by encryption, which IoT-device interactions can be inferred from such content, and whether there are unexpected exposures of private and/or sensitive information (e.g., video surreptitiously transmitted by a recording device). We highlight regional differences between these results, potentially due to different privacy regulations in the US and UK. Finally, we compare our controlled experiments with data gathered from an in situ user study comprising 36 participants.
DNS Observatory: The Big Picture of the DNS
Pawel Foremski, Oliver Gasser, G. Moura
DOI: https://doi.org/10.1145/3355369.3355566
The Domain Name System (DNS) is thought of as having the simple-sounding task of resolving domains into IP addresses. With its stub resolvers, different layers of recursive resolvers, authoritative nameservers, a multitude of query types, and DNSSEC, the DNS ecosystem is actually quite complex. In this paper, we introduce DNS Observatory: a new stream analytics platform that provides a bird's-eye view of the DNS. As the data source, we leverage a large stream of passive DNS observations produced by hundreds of globally distributed probes, acquiring a peak of 200k DNS queries per second between recursive resolvers and authoritative nameservers. For each observed DNS transaction, we extract traffic features, aggregate them, and track the top-k DNS objects, e.g., the top authoritative nameserver IP addresses or the top domains. We analyze 1.6 trillion DNS transactions over a four-month period. This allows us to characterize DNS deployments and traffic patterns, evaluate the associated infrastructure and performance, and gain insight into modern additions to the DNS and related Internet protocols. We find an alarming concentration of DNS traffic: roughly half of the observed traffic is handled by only 1k authoritative nameservers and by 10 AS operators. By evaluating the median delay of DNS queries, we find that the top 10k nameservers indeed have shorter response times than less popular nameservers, which correlates with fewer router hops. We also study how DNS TTL adjustments can impact query volumes and how negative caching TTLs affect the Happy Eyeballs algorithm, and we anticipate upcoming changes to DNS infrastructure. We find some popular domains with a share of up to 90% of empty DNS responses due to short negative caching TTLs. We propose actionable measures to improve the uncovered DNS shortcomings.
Booting the Booters: Evaluating the Effects of Police Interventions in the Market for Denial-of-Service Attacks
Ben Collier, Daniel R. Thomas, R. Clayton, Alice Hutchings
DOI: https://doi.org/10.1145/3355369.3355592
Illegal booter services offer denial of service (DoS) attacks for a fee of a few tens of dollars a month. Internationally, police have implemented a range of different types of intervention aimed at those using and offering booter services, including arrests and website takedowns. In order to measure the impact of these interventions we look at the usage reports that booters themselves provide and at measurements of reflected UDP DoS attacks, leveraging a five-year measurement dataset that has been statistically demonstrated to have very high coverage. We analysed time series data (using a negative binomial regression model) to show that several interventions have had a statistically significant impact on the number of attacks. We show that, while there is no consistent effect of highly-publicised court cases, takedowns of individual booters precede significant, but short-lived, reductions in recorded attack numbers. However, more wide-ranging disruptions have much longer effects. The closure of HackForums' booter market reduced attacks for 13 weeks globally (and for longer in particular countries), and the FBI's coordinated operation in December 2018, which involved both takedowns and arrests, reduced attacks by a third for at least 10 weeks and resulted in lasting change to the structure of the booter market.
Errors, Misunderstandings, and Attacks: Analyzing the Crowdsourcing Process of Ad-blocking Systems
Mshabab Alrizah, Sencun Zhu, Xinyu Xing, Gang Wang
DOI: https://doi.org/10.1145/3355369.3355588
Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset covering nine years of changes to the popular filter list EasyList, together with the error reports submitted by the crowd over the same period. Our study yielded a number of significant findings regarding the characteristics of false positive (FP) and false negative (FN) errors and their causes. For instance, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time to be discovered (50% of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad-blockers. In total, our analysis covers 15 types of attack methods, including 8 that had not previously been studied by the research community. We show how ad publishers have utilized them to circumvent ad-blockers and empirically measure the reactions of ad-blockers. Our in-depth analysis and findings are expected to shed light on future work to evolve ad blocking and optimize crowdsourcing mechanisms.
Modeling BBR's Interactions with Loss-Based Congestion Control
Ranysha Ware, Matthew K. Mukerjee, S. Seshan, Justine Sherry
DOI: https://doi.org/10.1145/3355369.3355604
BBR is a new congestion control algorithm (CCA) deployed for Chromium QUIC and the Linux kernel. As the default CCA for YouTube (which commands 11+% of Internet traffic), BBR has rapidly become a major player in Internet congestion control. BBR's fairness or friendliness to other connections has recently come under scrutiny, as measurements from multiple research groups have shown undesirable outcomes when BBR competes with traditional CCAs. One such outcome is a fixed, 40% proportion of link capacity consumed by a single BBR flow when competing with as many as 16 loss-based flows using algorithms like Cubic or Reno. In this short paper, we provide the first model capturing BBR's behavior in competition with loss-based CCAs. Our model is coupled with practical experiments to validate its implications. The key lesson is this: under competition, BBR becomes window-limited by its 'in-flight cap', which then determines BBR's bandwidth consumption. By modeling the value of BBR's in-flight cap under varying network conditions, we can predict BBR's throughput when competing against Cubic flows with a median error of 5%, and against Reno with a median error of 8%.