
Proceedings of the 2021 ACM SIGCOMM 2021 Conference: latest articles

Concordia: teaching the 5G vRAN to share compute
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472894
Xenofon Foukas, B. Radunovic
A Virtualized Radio Access Network (vRAN) offers a cost-efficient solution for running the 5G RAN as a virtualized network function (VNF) on commodity hardware. The vRAN is more efficient than traditional RANs, as it multiplexes several base station workloads on the same compute hardware. Our measurements show that, whilst this multiplexing provides efficiency gains, more than 50% of the CPU cycles in typical vRAN settings still remain unused. A way to further improve CPU utilization is to collocate the vRAN with general-purpose workloads. However, to maintain performance, vRAN tasks have sub-millisecond latency requirements that have to be met 99.999% of the time. We show that this is difficult to achieve with existing systems. We propose Concordia, a userspace deadline scheduling framework for the vRAN on Linux. Concordia builds prediction models using quantile decision trees to predict the worst-case execution times of vRAN signal processing tasks. The Concordia scheduler is fast (it runs every 20 µs) and the prediction models are accurate, enabling the system to reserve the minimum number of cores required for vRAN tasks, leaving the rest for general-purpose workloads. We evaluate Concordia on a commercial-grade reference vRAN platform. We show that it meets the 99.999% reliability requirements and reclaims more than 70% of idle CPU cycles without affecting the RAN performance.
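The idea of predicting a worst-case execution time from a high quantile of observed runtimes can be sketched as follows. This is a minimal illustration, not Concordia's model: it uses a single hand-picked split (a decision stump rather than a full tree), and the feature (number of scheduled users), the threshold, and the runtimes are all hypothetical.

```python
import math


def empirical_quantile(samples, q):
    """Return the smallest sample v such that at least a fraction q of samples are <= v."""
    if not samples:
        raise ValueError("a leaf needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def fit_quantile_stump(rows, threshold, q):
    """Fit a one-split quantile tree.

    rows is a list of (feature, runtime_us) pairs. Samples are split on
    feature <= threshold, and each leaf stores the q-quantile of its runtimes.
    """
    left = [runtime for feature, runtime in rows if feature <= threshold]
    right = [runtime for feature, runtime in rows if feature > threshold]
    return {
        "threshold": threshold,
        "left": empirical_quantile(left, q),
        "right": empirical_quantile(right, q),
    }


def predict_runtime(stump, feature):
    """Predict the q-quantile runtime for a task with the given feature value."""
    return stump["left"] if feature <= stump["threshold"] else stump["right"]


# Hypothetical measurements: (number of scheduled users, runtime in microseconds).
measurements = [(1, 10), (1, 12), (2, 11), (8, 40), (9, 45), (9, 50)]
stump = fit_quantile_stump(measurements, threshold=4, q=0.99)
print(predict_runtime(stump, 2), predict_runtime(stump, 9))
```

A scheduler could compare such a prediction with the time left until a task's deadline to decide how many cores to reserve. A real model would learn its splits from data and would need far more samples per leaf to support a 99.999% target.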
Citations: 28
Solar superstorms: planning for an internet apocalypse
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472916
S. Jyothi
Black swan events are hard-to-predict rare events that can significantly alter the course of our lives. The Internet has played a key role in helping us deal with the coronavirus pandemic, a recent black swan event. However, Internet researchers and operators are mostly blind to another black swan event that poses a direct threat to Internet infrastructure. In this paper, we investigate the impact of solar superstorms that can potentially cause large-scale Internet outages covering the entire globe and lasting several months. We discuss the challenges posed by such activity and currently available mitigation techniques. Using real-world datasets, we analyze the robustness of the current Internet infrastructure and show that submarine cables are at greater risk of failure compared to land cables. Moreover, the US has a higher risk for disconnection compared to Asia. Finally, we lay out steps for improving the Internet's resiliency.
Citations: 24
CliqueMap
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472934
Arjun Singhvi, Aditya Akella, Maggie Anderson, R. Cauble, Harshad Deshmukh, D. Gibson, Milo M. K. Martin, Amanda Strominger, T. Wenisch, Amin Vahdat
Distributed in-memory caching is a key component of modern Internet services. Such caches are often accessed via remote procedure call (RPC), as RPC frameworks provide rich support for productionization, including protocol versioning, memory efficiency, auto-scaling, and hitless upgrades. However, full-featured RPC limits performance and scalability as it incurs high latencies and CPU overheads. Remote Memory Access (RMA) offers a promising alternative, but meeting productionization requirements can be a significant challenge with RMA-based systems due to limited programmability and narrow RMA primitives. This paper describes the design, implementation, and experience derived from CliqueMap, a hybrid RMA/RPC caching system. CliqueMap has been in production use in Google's datacenters for over three years, currently serves more than 1 PB of DRAM, and underlies several end-user visible services. CliqueMap makes use of performant and efficient RMAs on the critical serving path and judiciously applies RPCs toward other functionality. The design embraces lightweight replication, client-based quoruming, self-validating server responses, per-operation client-side retries, and co-design with the network layers. These foci lead to a system resilient to the rigors of production and frequent post-deployment evolution.
Citations: 17
Seven years in the life of Hypergiants' off-nets
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472928
Petros Gigis, Matt Calder, Lefteris Manassakis, George Nomikos, Vasileios Kotronis, X. Dimitropoulos, Ethan Katz-Bassett, Georgios Smaragdakis
Content Hypergiants deliver the vast majority of Internet traffic to end users. In recent years, some have invested heavily in deploying services and servers inside end-user networks. With several dozen Hypergiants and thousands of servers deployed inside networks, these off-net (meaning outside the Hypergiant networks) deployments change the structure of the Internet. Previous efforts to study them have relied on proprietary data or specialized per-Hypergiant measurement techniques that neither scale nor generalize, providing a limited view of content delivery on today's Internet. In this paper, we develop a generic and easy-to-implement methodology to measure the expansion of Hypergiants' off-nets. Our key observation is that Hypergiants increasingly encrypt their traffic to protect their customers' privacy. Thus, we can analyze publicly available Internet-wide scans of port 443 and retrieve TLS certificates to discover which IP addresses host Hypergiant certificates in order to infer the networks hosting off-nets for the corresponding Hypergiants. Our results show that the number of networks hosting Hypergiant off-nets has tripled from 2013 to 2021, reaching 4.5k networks. The largest Hypergiants dominate these deployments, with almost all of these networks hosting an off-net for at least one (and increasingly two or more) of Google, Netflix, Facebook, or Akamai. These four Hypergiants have off-nets within networks that provide access to a significant fraction of the end-user population.
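The inference step described above (from certificates seen on port 443 to the networks hosting off-nets) can be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not the authors' pipeline: the scan records, the IP-to-AS mapping, the domain `hg.example`, and the AS numbers are all hypothetical, and the validation a real measurement would need is omitted.

```python
def offnet_networks(scan_records, ip_to_asn, hypergiant_domains, hypergiant_asns):
    """Return the set of AS numbers that host a Hypergiant certificate off-net.

    scan_records: iterable of (ip, dns_names), where dns_names are the DNS
        names found in the TLS certificate served by that IP on port 443.
    ip_to_asn: dict mapping an IP address to the AS number announcing it.
    hypergiant_domains: domains owned by the Hypergiant (assumed known).
    hypergiant_asns: the Hypergiant's own AS numbers, excluded as on-net.
    """
    hosting = set()
    for ip, dns_names in scan_records:
        serves_hypergiant_cert = any(
            name == domain or name.endswith("." + domain)
            for name in dns_names
            for domain in hypergiant_domains
        )
        if not serves_hypergiant_cert:
            continue
        asn = ip_to_asn.get(ip)
        # Skip unmapped addresses and the Hypergiant's own network.
        if asn is not None and asn not in hypergiant_asns:
            hosting.add(asn)
    return hosting


# Hypothetical inputs using documentation IP ranges and documentation ASNs.
records = [
    ("192.0.2.10", ["cache.hg.example"]),     # Hypergiant cert inside another AS
    ("198.51.100.7", ["www.other.example"]),  # unrelated certificate
    ("203.0.113.5", ["edge.hg.example"]),     # Hypergiant cert in its own AS
]
asn_map = {"192.0.2.10": 64500, "198.51.100.7": 64501, "203.0.113.5": 64510}
print(offnet_networks(records, asn_map, {"hg.example"}, {64510}))
```

Running the same function over scan snapshots from different years would give the kind of growth curve the abstract reports, assuming the snapshots are comparable.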
Citations: 33
Verifying learning-augmented systems
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472936
Tomer Eliyahu, Yafim Kazak, Guy Katz, Michael Schapira
The application of deep reinforcement learning (DRL) to computer and networked systems has recently gained significant popularity. However, the obscurity of decisions by DRL policies renders it hard to ascertain that learning-augmented systems are safe to deploy, posing a significant obstacle to their real-world adoption. We observe that specific characteristics of recent applications of DRL to systems contexts give rise to an exciting opportunity: applying formal verification to establish that a given system provably satisfies designer/user-specified requirements, or to expose concrete counter-examples. We present whiRL, a platform for verifying DRL policies for systems, which combines recent advances in the verification of deep neural networks with scalable model checking techniques. To exemplify its usefulness, we employ whiRL to verify natural requirements from recently introduced learning-augmented systems for three real-world environments: Internet congestion control, adaptive video streaming, and job scheduling in compute clusters. Our evaluation shows that whiRL is capable of guaranteeing that natural requirements from these systems are satisfied, and of exposing specific scenarios in which other basic requirements are not.
Citations: 37
Snowcap: synthesizing network-wide configuration updates
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472915
Tibor Schneider, Rüdiger Birkner, L. Vanbever
Large-scale reconfiguration campaigns tend to be nerve-racking for network operators as they can lead to significant network downtimes, decreased performance, and policy violations. Unfortunately, existing reconfiguration frameworks often fall short in practice as they either only support a small set of reconfiguration scenarios or simply do not scale. We address these problems with Snowcap, the first network reconfiguration framework which can synthesize configuration updates that comply with arbitrary hard and soft specifications, and involve arbitrary routing protocols. Our key contribution is an efficient search procedure which leverages counter-examples to efficiently navigate the space of configuration updates. Given a reconfiguration ordering which violates the desired specifications, our algorithm automatically identifies the problematic commands so that it can avoid this particular order in the next iteration. We fully implemented Snowcap and extensively evaluated its scalability and effectiveness on real-world topologies and typical, large-scale reconfiguration scenarios. Even for large topologies, Snowcap finds a valid reconfiguration ordering with minimal side-effects (i.e., traffic shifts) within a few seconds at most.
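To give a feel for how counter-examples can prune the space of command orderings, here is a small sketch. It is not Snowcap's algorithm: it only remembers the violating prefix of each failed ordering and skips later orderings that start with it, which assumes a violation depends solely on the commands applied so far. The command names and the checker are hypothetical.

```python
from itertools import permutations


def find_safe_ordering(commands, first_violation):
    """Search for an ordering of commands that never violates the specification.

    first_violation(order) must return the index of the first command whose
    application violates the specification, or None if the ordering is safe.
    Each failed ordering contributes a violating prefix (the counter-example),
    and every later ordering that begins with a known bad prefix is skipped.
    """
    bad_prefixes = set()
    for order in permutations(commands):
        if any(order[:n] in bad_prefixes for n in range(1, len(order) + 1)):
            continue  # pruned without calling the checker
        index = first_violation(order)
        if index is None:
            return list(order)
        bad_prefixes.add(order[: index + 1])
    return None


def backup_before_removal(order):
    """Hypothetical checker: the primary route may be removed only after a backup exists."""
    applied = set()
    for index, command in enumerate(order):
        if command == "remove_primary" and "add_backup" not in applied:
            return index
        applied.add(command)
    return None


print(find_safe_ordering(("remove_primary", "add_backup", "tune_weight"),
                         backup_before_removal))
```

In this toy run the checker is called twice: the first ordering fails at its first command, the second is pruned by the learned prefix, and the third is accepted.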
Citations: 17
Capacity-efficient and uncertainty-resilient backbone network planning with hose
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472918
S. Ahuja, Varun Gupta, V. Dangui, Soshant Bali, A. Gopalan, Hao Zhong, Petr Lapukhov, Yiting Xia, Ying Zhang
This paper presents Facebook's design and operational experience of a Hose-based backbone network planning system. This initial adoption of the Hose model in network planning is driven by the capacity and demand uncertainty pressure of backbone expansion. Since the Hose model abstracts the aggregated traffic demand per site, peak traffic flows at different times can be multiplexed to save capacity and buffer traffic spikes. Our core design involves heuristic algorithms to select Hose-compliant traffic matrices and cross-layer optimization between the optical and IP networks. We evaluate the system performance in production and share insights from years of production experience. Hose-based network planning saves 17.4% of capacity and drops 75% less traffic under fiber cuts. As the first study of Hose in network planning, our work has the potential to inspire follow-up research.
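As a rough illustration of what "Hose-compliant" means, the sketch below checks a traffic matrix against per-site aggregate limits, following the usual definition of the Hose model (each site's total outgoing and total incoming traffic is bounded, while individual site-to-site flows are not). The site count, limits, and matrices are hypothetical, and the selection heuristics mentioned in the abstract are not reproduced here.

```python
def is_hose_compliant(matrix, egress_limit, ingress_limit):
    """Check a traffic matrix against per-site Hose limits.

    matrix[i][j] is the demand from site i to site j (diagonal ignored).
    egress_limit[i] bounds the total traffic leaving site i, and
    ingress_limit[j] bounds the total traffic entering site j.
    """
    sites = range(len(matrix))
    for i in sites:
        if sum(matrix[i][j] for j in sites if j != i) > egress_limit[i]:
            return False
    for j in sites:
        if sum(matrix[i][j] for i in sites if i != j) > ingress_limit[j]:
            return False
    return True


# Hypothetical three-site example with a limit of 10 units per direction.
limits = [10, 10, 10]
matrix_a = [[0, 6, 4], [3, 0, 5], [2, 4, 0]]  # row sums 10, 8, 6; column sums 5, 10, 9
matrix_b = [[0, 8, 4], [3, 0, 5], [2, 4, 0]]  # site 0 sends 12 units in total
print(is_hose_compliant(matrix_a, limits, limits))
print(is_hose_compliant(matrix_b, limits, limits))
```

Many different matrices satisfy the same limits, which is why a planner has to pick a representative set of compliant matrices rather than size the network for a single forecast.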
Citations: 11
LAVA
Pub Date : 2021-08-09 DOI: 10.1007/978-3-540-72816-0_12856
R. I. Zelaya, W. Sussman, Jeremy Gummeson, Kyle Jamieson, Wenjun Hu
Citations: 0
Designing data center networks using bottleneck structures
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472898
Jordi Ros-Giralt, Noah Amsel, Sruthi Yellamraju, J. Ezick, R. Lethin, Yuang Jiang, Aosong Feng, L. Tassiulas, Zhenguo Wu, Min Yee Teh, K. Bergman
This paper provides a mathematical model of data center performance based on the recently introduced Quantitative Theory of Bottleneck Structures (QTBS). Using the model, we prove that if the traffic pattern is interference-free, there exists a unique optimal design that both minimizes maximum flow completion time and yields maximal system-wide throughput. We show that interference-free patterns correspond to the important set of patterns that display data locality properties and use these theoretical insights to study three widely used interconnects: fat-tree, folded-Clos, and dragonfly topologies. We derive equations that describe the optimal design for each interconnect as a function of the traffic pattern. Our model predicts, for example, that a 3-level folded-Clos interconnect with radix 24 that routes 10% of the traffic through the spine links can reduce the number of switches and cabling at the core layer by 25% without any performance penalty. We present experiments using production TCP/IP code to empirically validate the results and provide tables for network designers to identify optimal designs as a function of the size of the interconnect and traffic pattern.
Citations: 5
From IP to transport and beyond: cross-layer attacks against applications
Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472933
Tianxiang Dai, Philipp Jeitner, Haya Shulman, M. Waidner
We perform the first analysis of methodologies for launching DNS cache poisoning: manipulation at the IP layer, hijacking of inter-domain routing, and probing open ports via side channels. We evaluate these methodologies against DNS resolvers on the Internet and compare them with respect to effectiveness, applicability and stealth. Our study shows that DNS cache poisoning is a practical and pervasive threat. We then demonstrate cross-layer attacks that leverage DNS cache poisoning to attack popular systems, ranging from security mechanisms, such as RPKI, to applications, such as VoIP. In addition to more traditional adversarial goals, most notably impersonation and Denial of Service, we show for the first time that DNS cache poisoning can even enable adversaries to bypass cryptographic defences: we demonstrate how DNS cache poisoning can facilitate BGP prefix hijacking of networks protected with RPKI even when all the other networks apply route origin validation to filter invalid BGP announcements. Our study shows that DNS plays a much more central role in Internet security than previously assumed. We recommend mitigations for securing the applications and for preventing cache poisoning.
Citations: 11