Bridging the gap between applications and networks in data centers

ACM SIGOPS Oper. Syst. Rev., Pub Date: 2013-01-29, DOI: 10.1145/2433140.2433143

Paolo Costa


Citations: 14

Abstract

Modern data centers host tens (if not hundreds) of thousands of servers and are used by companies such as Amazon, Google, and Microsoft to provide online services to millions of individuals distributed across the Internet. They use commodity hardware, and their network infrastructure adopts principles evolved from enterprise and Internet networking. Applications use UDP datagrams or TCP sockets as the primary interface to other applications running inside the data center. This effectively isolates the network from the end-systems, which then have little control over how the network handles packets. Likewise, the network has limited visibility into the application logic. An application injects a packet with a destination address and the network just delivers the packet. Network and applications effectively treat each other as black boxes. This strict separation between applications and networks (also referred to as a dumb network) is a direct outcome of the so-called end-to-end argument [49] and has arguably been one of the main reasons why the Internet has been capable of evolving from a small research project to planetary scale, supporting a multitude of different hardware and network technologies as well as a slew of very diverse applications, and using networks owned by competing ISPs. Despite being so instrumental in the success of the Internet, this black-box design is also one of the root causes of inefficiencies in large-scale data centers. Given their limited control and visibility over network resources, applications need to use low-level hacks, e.g., to extract network properties (e.g., using traceroute and IP addresses to infer the network topology) and to prioritize traffic (e.g., increasing the number of TCP flows used by an application to increase its bandwidth share). Further, simple functionality such as multicast or anycast routing is not available, and developers must resort to application-level overlays. This, however, leads to inefficiencies, as typically multiple logical links are mapped to the same physical link, significantly reducing application throughput. Even with perfect knowledge of the underlying topology, there is still the constraint that servers
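Two of the effects mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names, the topology, and the numbers are hypothetical, and the model assumes idealized per-flow fair sharing on a single bottleneck link (real TCP behavior also depends on round-trip times, loss, and the congestion control variant).

```python
from collections import Counter


def app_shares(capacity_gbps, flows_per_app):
    """Bandwidth per application on one bottleneck link.

    Assumption: every flow gets an equal share of the link, so an
    application's share grows with the number of flows it opens.
    """
    total_flows = sum(flows_per_app.values())
    return {
        app: capacity_gbps * n / total_flows
        for app, n in flows_per_app.items()
    }


def link_stress(overlay_edges, physical_path):
    """Count how many logical (overlay) links cross each physical link.

    `physical_path` maps an overlay edge to the list of physical links
    it traverses; this mapping is assumed to be known here.
    """
    stress = Counter()
    for edge in overlay_edges:
        for link in physical_path[edge]:
            stress[link] += 1
    return stress


# Hypothetical example 1: application A opens 4 flows, B opens 1.
shares = app_shares(10.0, {"A": 4, "B": 1})

# Hypothetical example 2: server s1 forwards to s2 and s3 through an
# application-level overlay; both logical links share s1's uplink.
overlay = [("s1", "s2"), ("s1", "s3")]
paths = {
    ("s1", "s2"): [("s1", "tor"), ("tor", "s2")],
    ("s1", "s3"): [("s1", "tor"), ("tor", "s3")],
}
stress = link_stress(overlay, paths)

print(shares)
print(stress[("s1", "tor")])
```

Under these assumptions, opening more flows shifts bandwidth toward application A without the network knowing why, and the uplink from s1 carries both logical links, so each can use at most half of its capacity.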


