We present the design, implementation, evaluation, and validation of a system that automatically learns to extract router names (router identifiers) from hostnames stored by network operators in different DNS zones; we represent each zone's naming convention as a regular expression (regex). Our supervised-learning approach evaluates automatically generated candidate regexes against sets of hostnames for IP addresses that other alias resolution techniques previously inferred to identify interfaces on the same router. Conceptually, we conclude that a regex accurately represents the naming convention for a suffix if three conditions hold: (1) the regex extracts the same value from a set of hostnames associated with IP addresses on the same router; (2) the value is unique to that router; and (3) the regex extracts names for multiple routers in the suffix. We train our system using router aliases inferred from active probing to learn regexes for 2550 different suffixes. We then demonstrate the utility of this system by using the regexes to find 105% additional aliases for these suffixes. Regexes inferred in IPv4 perfectly predict aliases for ≈85% of suffixes with IPv6 aliases, i.e., IPv4 and IPv6 addresses representing the same underlying router, and find 9.0 times more routers in IPv6 than found by prior techniques.
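The three conditions above can be sketched as a toy scoring function. The suffix, hostnames, and candidate regexes below are hypothetical, and the paper's actual training system is considerably more involved:

```python
import re
from collections import defaultdict

def evaluate_regex(regex, training_routers):
    """Score a candidate regex against routers whose interface hostnames
    are already known (e.g. from active-probing alias resolution).
    training_routers: one list of hostnames per router.
    Returns True only if all three conditions from the abstract hold;
    an illustrative simplification, not the paper's scoring function."""
    pattern = re.compile(regex)
    name_to_routers = defaultdict(set)
    for router_id, hostnames in enumerate(training_routers):
        extracted = set()
        for h in hostnames:
            m = pattern.match(h)
            if m is None:
                return False               # regex fails to extract a name
            extracted.add(m.group(1))
        if len(extracted) != 1:
            return False                   # (1) one value per router
        name_to_routers[extracted.pop()].add(router_id)
    if any(len(r) > 1 for r in name_to_routers.values()):
        return False                       # (2) value unique to that router
    return len(name_to_routers) > 1        # (3) names multiple routers

# Hypothetical hostnames for two routers under one DNS suffix:
routers = [
    ["ae1.core1.lax.example.net", "ae2.core1.lax.example.net"],
    ["ae1.core2.sjc.example.net", "ae3.core2.sjc.example.net"],
]
good = r"^[^.]+\.([^.]+\.[^.]+)\.example\.net$"  # extracts core1.lax / core2.sjc
bad = r"^([^.]+)\..*\.example\.net$"             # extracts the interface, not the router
print(evaluate_regex(good, routers))  # True
print(evaluate_regex(bad, routers))   # False
```

The `good` regex pulls out a per-router name shared by all of that router's interfaces; the `bad` regex extracts the interface label, violating condition (1).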
"Learning Regexes to Extract Router Names from Hostnames". M. Luckie, B. Huffaker, K. Claffy. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355589.
Yi Cao, Javad Nejati, A. Balasubramanian, Anshul Gandhi
Given the growing significance of network performance, it is crucial to examine how to make the most of available network options and protocols. We propose ECON, a model that predicts performance of applications under different protocols and network conditions to scalably make better network choices. ECON is built on an analytical framework to predict TCP performance, and uses the TCP model as a building block for predicting application performance. ECON infers a relationship between loss and congestion using empirical data that drives an online model to predict TCP performance. ECON then builds on the TCP model to predict latency and HTTP performance. Across four wired and one wireless network, our model outperforms seven alternative TCP models. We demonstrate how ECON (i) can be used by a Web server application to choose between HTTP/1.1 and HTTP/2 for a given Web page and network condition, and (ii) can be used by a video application to choose the optimal bitrate that maximizes video quality without rebuffering.
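As a reference point for the kind of analytical TCP building block described above, the classic square-root model of Mathis et al. predicts steady-state TCP throughput from MSS, RTT, and loss rate. This is a classical model of the sort ECON is compared against, not ECON's own predictor:

```python
from math import sqrt

def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    """Classic square-root model of steady-state TCP throughput:
    throughput ~= (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2).
    Illustrative only; ECON's model goes beyond this formulation."""
    C = sqrt(3.0 / 2.0)
    return (mss_bytes / rtt_s) * (C / sqrt(loss_rate))

# 1460-byte MSS, 50 ms RTT, 0.1% loss rate -> bytes per second
bps = mathis_throughput(1460, 0.050, 0.001)
print(f"{bps * 8 / 1e6:.1f} Mbit/s")  # 9.0 Mbit/s
```

Note how the inverse-square-root dependence on loss makes throughput collapse quickly as loss grows, which is why modeling the loss-congestion relationship (as ECON does) matters.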
"ECON: Modeling the network to improve application performance". Yi Cao, Javad Nejati, A. Balasubramanian, Anshul Gandhi. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355578.
Yi Cao, Arpit Jain, K. Sharma, A. Balasubramanian, Anshul Gandhi
This short paper presents a detailed empirical study of BBR's performance under different real-world and emulated testbeds across a range of network operating conditions. Our empirical results help to identify network conditions under which BBR outperforms, in terms of goodput, contemporary TCP congestion control algorithms. We find that BBR is well suited for networks with shallow buffers, despite its high retransmissions, whereas existing loss-based algorithms are better suited for deep buffers. To identify the root causes of BBR's limitations, we carefully analyze our empirical results. Our analysis reveals that, contrary to BBR's design goal, BBR often exhibits large queue sizes. Further, the regimes where BBR performs well are often the same regimes where BBR is unfair to competing flows. Finally, we demonstrate the existence of a loss rate "cliff point" beyond which BBR's goodput drops abruptly. Our empirical investigation identifies the likely culprits in each of these cases as specific design options in BBR's source code.
"When to use and when not to use BBR: An empirical analysis and evaluation study". Yi Cao, Arpit Jain, K. Sharma, A. Balasubramanian, Anshul Gandhi. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355579.
M. Bashir, Sajjad Arshad, E. Kirda, William K. Robertson, Christo Wilson
Programmatic advertising provides digital ad buyers with the convenience of purchasing ad impressions through Real Time Bidding (RTB) auctions. However, programmatic advertising has also given rise to a novel form of ad fraud known as domain spoofing, in which attackers sell counterfeit impressions that claim to be from high-value publishers. To mitigate domain spoofing, the Interactive Advertising Bureau (IAB) Tech Lab introduced the ads.txt standard in May 2017 to help ad buyers verify authorized digital ad sellers, as well as to promote overall transparency in programmatic advertising. In this work, we present a 15-month longitudinal, observational study of the ads.txt standard. We do this to understand (1) if it is helping ad buyers to combat domain spoofing and (2) whether the transparency offered by the standard can provide useful data to researchers and privacy advocates. With respect to halting domain spoofing, we observe that over 60% of Alexa Top-100K publishers that run RTB ads have adopted ads.txt, and that ad exchanges and advertisers appear to be honoring the standard. With respect to transparency, the widespread adoption of ads.txt allows us to explicitly identify over 1,000 domains belonging to ad exchanges, without having to rely on crowdsourcing or heuristic methods. However, we also find that ads.txt is still a long way from reaching its full potential. Many publishers have yet to adopt the standard, and we observe major ad exchanges purchasing unauthorized impressions that violate the standard. This opens the door to domain spoofing attacks. Further, ads.txt data often include errors that must be cleaned and mitigated before the data is practically useful.
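For reference, ads.txt files are plain text with one authorized seller per line: ad system domain, seller account ID, relationship (DIRECT or RESELLER), and an optional certification authority ID. A minimal parser following the IAB field layout (the error handling is a simplification of what a real crawler needs):

```python
def parse_ads_txt(text):
    """Parse ads.txt content into authorized-seller records.
    Returns (records, errors); errors collects malformed data lines,
    the kind of noise the study reports must be cleaned first."""
    records, errors = [], []
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        if "=" in line and "," not in line:   # variable record, e.g. contact=...
            continue
        fields = [f.strip() for f in line.split(",")]
        if len(fields) not in (3, 4) or fields[2].upper() not in ("DIRECT", "RESELLER"):
            errors.append((lineno, raw))
            continue
        records.append({
            "ad_system": fields[0].lower(),
            "seller_id": fields[1],
            "relationship": fields[2].upper(),
            "cert_authority": fields[3] if len(fields) == 4 else None,
        })
    return records, errors

sample = """# ads.txt for example.com
google.com, pub-1234567890, DIRECT, f08c47fec0942fa0
appnexus.com, 7890, RESELLER
contact=adops@example.com
badline without commas
"""
records, errors = parse_ads_txt(sample)
print(len(records), len(errors))  # 2 1
```

An ad buyer can then check whether the exchange and seller ID on a bid request appear among the publisher's records before treating the impression as authorized.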
"A Longitudinal Analysis of the ads.txt Standard". M. Bashir, Sajjad Arshad, E. Kirda, William K. Robertson, Christo Wilson. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355603.
Understanding and characterizing the reliability of a mobile broadband network is a challenging task due to the presence of a multitude of root causes that operate at different temporal and spatial scales. This, in turn, limits the use of classical statistical methods for characterizing the mobile network's reliability. We propose leveraging tensor factorizations, a well-established data mining method, to address this challenge. We represent a year-long time series of outages from two mobile operators as multi-way arrays, and demonstrate how tensor factorizations help in extracting the outage patterns at various time-scales, making it easy to locate possible root causes. Unlike traditional methods of time series analysis, tensor factorizations provide a compact and interpretable picture of outages.
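To illustrate the method, here is a minimal CP (CANDECOMP/PARAFAC) decomposition by alternating least squares on a toy operator × day × hour outage tensor. The tensor layout and the factorization tooling the study actually used may differ:

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: column r is kron(U[:, r], V[:, r])."""
    I, R = U.shape
    J, _ = V.shape
    return (U[:, None, :] * V[None, :, :]).reshape(I * J, R)

def cp_als(X, rank, iters=100, seed=0):
    """Minimal CP decomposition of a 3-way array via alternating least
    squares; an illustrative sketch, not a production implementation."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
    X0 = X.reshape(I, J * K)                     # mode-0 unfolding
    X1 = X.transpose(1, 0, 2).reshape(J, I * K)  # mode-1 unfolding
    X2 = X.transpose(2, 0, 1).reshape(K, I * J)  # mode-2 unfolding
    for _ in range(iters):
        A = X0 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X1 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X2 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Toy outage tensor (operator x day-of-week x hour) built from one pattern
rng = np.random.default_rng(1)
a, b, c = rng.random((2, 1)), rng.random((7, 1)), rng.random((24, 1))
X = (a @ khatri_rao(b, c).T).reshape(2, 7, 24)
A, B, C = cp_als(X, rank=1, iters=50)
Xhat = (A @ khatri_rao(B, C).T).reshape(2, 7, 24)
print(np.allclose(X, Xhat, atol=1e-8))  # True
```

On real data the recovered factor columns are what make the decomposition interpretable: a component's day and hour factors directly show, for example, a weekday maintenance-window outage pattern.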
"Multiway Reliability Analysis of Mobile Broadband Networks". Mah-Rukh Fida, E. Acar, A. Elmokashfi. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355591.
Brandon Schlinker, Ítalo F. S. Cunha, Yi-Ching Chiu, S. Sundaresan, Ethan Katz-Bassett
We examine the current state of user network performance and opportunities to improve it from the vantage point of Facebook, a global content provider. Facebook serves over 2 billion users distributed around the world using a network of PoPs and interconnections spread across 6 continents. In this paper, we execute a large-scale, 10-day measurement study of metrics at the TCP and HTTP layers for production user traffic at all of Facebook's PoPs worldwide, collecting performance measurements for hundreds of trillions of sampled HTTP sessions. We discuss our approach to collecting and analyzing measurements, including a novel approach to characterizing user achievable goodput from the server side. We find that most user sessions have MinRTT less than 39ms and can support HD video. We investigate if it is possible to improve performance by incorporating performance information into Facebook's routing decisions; we find that default routing by Facebook is largely optimal. To our knowledge, our measurement study is the first characterization of user performance on today's Internet from the vantage point of a global content provider.
"Internet Performance from Facebook's Edge". Brandon Schlinker, Ítalo F. S. Cunha, Yi-Ching Chiu, S. Sundaresan, Ethan Katz-Bassett. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355567.
M. Müller, Matthew Thomas, D. Wessels, W. Hardaker, Taejoong Chung, W. Toorop, R. V. Rijswijk-Deij
The DNS Security Extensions (DNSSEC) add authenticity and integrity to the naming system of the Internet. Resolvers that validate information in the DNS need to know the cryptographic public key used to sign the root zone of the DNS. Eight years after its introduction and one year after the originally scheduled date, this key was replaced by ICANN for the first time in October 2018. ICANN considered this event, called a rollover, "an overwhelming success" and during the rollover they detected "no significant outages". In this paper, we independently follow the process of the rollover starting from the events that led to its postponement in 2017 until the removal of the old key in 2019. We collected data from multiple vantage points in the DNS ecosystem for the entire duration of the rollover process. Using this data, we study key events of the rollover. These events include telemetry signals that led to the rollover being postponed, a near real-time view of the actual rollover in resolvers and a significant increase in queries to the root of the DNS once the old key was revoked. Our analysis contributes significantly to identifying the causes of challenges observed during the rollover. We show that while from an end-user perspective, the roll indeed passed without major problems, there are many opportunities for improvement and important lessons to be learned from events that occurred over the entire duration of the rollover. Based on these lessons, we propose improvements to the process for future rollovers.
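The telemetry signals mentioned above (RFC 8145 trust-anchor signaling) identify keys by their key tag, computed over the DNSKEY RDATA per RFC 4034 Appendix B; KSK-2010 and KSK-2017 carry key tags 19036 and 20326 respectively. A sketch of the computation (the example RDATA is synthetic, not a real key):

```python
def dnskey_key_tag(rdata: bytes) -> int:
    """RFC 4034 Appendix B key tag over DNSKEY RDATA
    (flags | protocol | algorithm | public key), for algorithms
    other than the obsolete RSA/MD5 special case."""
    ac = 0
    for i, byte in enumerate(rdata):
        # even offsets contribute the high octet, odd offsets the low octet
        ac += byte << 8 if i % 2 == 0 else byte
    ac += (ac >> 16) & 0xFFFF   # fold the carry back in
    return ac & 0xFFFF

# Synthetic RDATA: flags 0x0101 (KSK), protocol 3, algorithm 8, 4 key bytes
toy_rdata = bytes([0x01, 0x01, 0x03, 0x08, 0x00, 0x01, 0x02, 0x03])
print(dnskey_key_tag(toy_rdata))  # 1549
```

During the rollover, counting RFC 8145 reports that listed tag 19036 but not 20326 is what revealed resolvers that had not yet picked up the new key.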
"Roll, Roll, Roll your Root: A Comprehensive Analysis of the First Ever DNSSEC Root KSK Rollover". M. Müller, Matthew Thomas, D. Wessels, W. Hardaker, Taejoong Chung, W. Toorop, R. V. Rijswijk-Deij. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355570.
Sai Teja Peddinti, Igor Bilogrevic, N. Taft, M. Pelikán, Ú. Erlingsson, Pauline Anthonysamy, G. Hogben
Users of mobile apps sometimes express discomfort or concerns with what they see as unnecessary or intrusive permission requests by certain apps. However, encouraging mobile app developers to request fewer permissions is challenging because there are many reasons why permissions are requested; furthermore, prior work [25] has shown it is hard to disambiguate the purpose of a particular permission with high certainty. In this work we describe a novel, algorithmic mechanism intended to discourage mobile-app developers from asking for unnecessary permissions. Developers are incentivized by an automated alert, or "nudge", shown in the Google Play Console when their apps ask for permissions that are requested by very few functionally-similar apps---in other words, by their competition. Empirically, this incentive is effective, with significant developer response since its deployment. Permissions have been redacted by 59% of apps that were warned, and this attenuation has occurred broadly across both app categories and app popularity levels. Importantly, billions of users' app installs from Google Play have benefited from these redactions.
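The peer-group comparison behind such a nudge can be sketched as follows; the grouping, the 5% threshold, and the permission names are hypothetical, not the paper's actual algorithm:

```python
from collections import Counter

def unusual_permissions(app_permissions, peer_permission_sets, threshold=0.05):
    """Flag permissions requested by fewer than `threshold` of
    functionally-similar peer apps. A simplified sketch of the
    peer-group comparison described in the abstract."""
    counts = Counter(p for perms in peer_permission_sets for p in set(perms))
    n = len(peer_permission_sets)
    return sorted(p for p in app_permissions if counts[p] / n < threshold)

# 30 hypothetical peer apps in the same functional category
peers = [{"CAMERA"}, {"CAMERA", "LOCATION"}, {"CAMERA"}] * 10
print(unusual_permissions({"CAMERA", "READ_CONTACTS"}, peers))  # ['READ_CONTACTS']
```

Here CAMERA is requested by every peer and passes silently, while READ_CONTACTS is requested by none of them and would trigger the nudge.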
"Reducing Permission Requests in Mobile Apps". Sai Teja Peddinti, Igor Bilogrevic, N. Taft, M. Pelikán, Ú. Erlingsson, Pauline Anthonysamy, G. Hogben. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355584.
Johannes Naab, Patrick Sattler, Jonas Jelten, Oliver Gasser, G. Carle
Domain-based top lists such as the Alexa Top 1M strive to portray the popularity of web domains. Even though their shortcomings (e.g., instability, no aggregation, lack of weights) have been pointed out, domain-based top lists are still an important element of Internet measurement studies. In this paper we present the concept of prefix top lists, which ameliorate some of these shortcomings while providing insights into the importance of the addresses behind domain-based top lists. With prefix top lists, we aggregate domain-based top lists into network prefixes and apply a Zipf distribution to assign weights to each prefix. In our analysis we find that different domain-based top lists provide differentiated views on Internet prefixes. In addition, we observe very small weight changes over time. We leverage prefix top lists to conduct an evaluation of the DNS to classify the deployment quality of domains. We show that popular domains adhere to name server recommendations for IPv4, but IPv6 compliance is still lacking. Finally, we provide these enhanced and more stable prefix top lists to fellow researchers, who can use them to obtain more representative measurement results.
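A minimal sketch of the aggregation step: each rank-r domain contributes Zipf weight 1/r, attributed to the prefixes covering its resolved addresses. The resolution data and prefixes below are hypothetical, and splitting a domain's weight evenly across its addresses is an assumption of this sketch, not necessarily the paper's rule:

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

def prefix_top_list(ranked_domains, domain_to_ips, prefixes):
    """Aggregate a domain-based top list into weighted network prefixes.
    ranked_domains: domains in rank order (rank 1 first).
    domain_to_ips: resolved addresses per domain.
    prefixes: the covering prefixes (e.g. BGP-announced prefixes)."""
    nets = [ip_network(p) for p in prefixes]
    weights = defaultdict(float)
    for rank, domain in enumerate(ranked_domains, start=1):
        ips = domain_to_ips.get(domain, [])
        if not ips:
            continue
        share = (1.0 / rank) / len(ips)   # Zipf weight, split across addresses
        for ip in ips:
            addr = ip_address(ip)
            for net in nets:
                if addr in net:
                    weights[str(net)] += share
                    break
    return dict(weights)

ranked = ["example.org", "example.net"]
resolved = {"example.org": ["192.0.2.10"],
            "example.net": ["192.0.2.99", "198.51.100.7"]}
prefixes = ["192.0.2.0/24", "198.51.100.0/24"]
result = prefix_top_list(ranked, resolved, prefixes)
print(result)  # {'192.0.2.0/24': 1.25, '198.51.100.0/24': 0.25}
```

Because many domains map into the same prefix, small churn in domain ranks largely cancels out at the prefix level, which is the stability property the abstract highlights.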
"Prefix Top Lists: Gaining Insights with Prefixes from Domain-based Top Lists on DNS Deployment". Johannes Naab, Patrick Sattler, Jonas Jelten, Oliver Gasser, G. Carle. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355598.
Taejoong Chung, E. Aben, Tim Bruijnzeels, B. Chandrasekaran, D. Choffnes, Dave Levin, B. Maggs, A. Mislove, R. V. Rijswijk-Deij, John P. Rula, N. Sullivan
Despite its critical role in Internet connectivity, the Border Gateway Protocol (BGP) remains highly vulnerable to attacks such as prefix hijacking, where an Autonomous System (AS) announces routes for IP space it does not control. To address this issue, the Resource Public Key Infrastructure (RPKI) was developed starting in 2008, with deployment beginning in 2011. This paper performs the first comprehensive, longitudinal study of the deployment, coverage, and quality of RPKI. We use a unique dataset containing all RPKI Route Origin Authorizations (ROAs) from the moment RPKI was first deployed, more than 8 years ago. We combine this dataset with BGP announcements from more than 3,300 BGP collectors worldwide. Our analysis shows that after a gradual start, RPKI has seen a rapid increase in adoption over the past two years. We also show that although misconfigurations were rampant when RPKI was first deployed (causing many announcements to appear as invalid), they are quite rare today. We develop a taxonomy of invalid RPKI announcements, then quantify their prevalence. We further identify suspicious announcements indicative of prefix hijacking and present case studies of likely hijacks. Overall, we conclude that while misconfigurations still do occur, RPKI is "ready for the big screen," and routing security can be increased by dropping invalid announcements. To foster reproducibility and further studies, we release all RPKI data and the tools we used to analyze it into the public domain.
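Classifying an announcement against ROAs follows RFC 6811 route origin validation: an announcement is valid if some covering ROA matches its origin AS and its prefix length does not exceed the ROA's maxLength, invalid if covered ROAs exist but none matches, and not-found otherwise. A minimal sketch (the ROA data and ASNs are illustrative):

```python
from ipaddress import ip_network

def validate_route(prefix, origin_asn, roas):
    """RFC 6811 route origin validation.
    roas: iterable of (roa_prefix, max_length, asn) tuples.
    Returns 'valid', 'invalid', or 'not-found'."""
    ann = ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in roas:
        roa = ip_network(roa_prefix)
        if ann.version == roa.version and ann.subnet_of(roa):
            covered = True  # at least one ROA covers this prefix
            if asn == origin_asn and ann.prefixlen <= max_length:
                return "valid"
    return "invalid" if covered else "not-found"

roas = [("192.0.2.0/24", 24, 64496)]
print(validate_route("192.0.2.0/24", 64496, roas))    # valid
print(validate_route("192.0.2.0/25", 64496, roas))    # invalid (exceeds maxLength)
print(validate_route("192.0.2.0/24", 64511, roas))    # invalid (wrong origin)
print(validate_route("198.51.100.0/24", 64496, roas)) # not-found
```

The "wrong origin" case is the shape a prefix hijack takes under RPKI, while maxLength violations are a common source of the misconfigurations the paper taxonomizes.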
"RPKI is Coming of Age: A Longitudinal Study of RPKI Deployment and Invalid Route Origins". Taejoong Chung, E. Aben, Tim Bruijnzeels, B. Chandrasekaran, D. Choffnes, Dave Levin, B. Maggs, A. Mislove, R. V. Rijswijk-Deij, John P. Rula, N. Sullivan. In Proceedings of the Internet Measurement Conference (IMC 2019), October 2019. DOI: 10.1145/3355369.3355596.