Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, T. Skeie
Partition-Aware Routing to Improve Network Isolation in Infiniband Based Multi-tenant Clusters
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 189-198. Published 2015-07-09. DOI: 10.1109/CCGrid.2015.96 (https://doi.org/10.1109/CCGrid.2015.96)
Citations: 10
Abstract
InfiniBand (IB) is a widely used network interconnect for modern high-performance computing systems. In large IB fabrics, isolation of nodes is provided through partitioning. The routing algorithm, however, is unaware of these partitions in the network. As a result, traffic flows belonging to different partitions might share links inside the network fabric. This sharing of intermediate links creates interference, which is particularly critical to avoid in multi-tenant environments such as clouds. In such systems, each tenant should experience predictable network performance, unaffected by the workload of other tenants. In addition, current routing schemes consider routes that cross partition boundaries when distributing routes onto links in the network, even though these routes will never be used. The result is degraded load balancing. In this paper, we present a novel partition-aware fat-tree routing algorithm, pFTree. The pFTree algorithm uses several mechanisms to provide network-wide isolation of partitions belonging to different tenant groups. Given the available network resources, pFTree starts by isolating partitions at the physical link level and then moves on to use virtual lanes if needed. Our experiments and simulations show that pFTree significantly reduces the effect of inter-partition interference without any additional functional overhead. Furthermore, pFTree provides improved load balancing over the de facto standard IB fat-tree routing algorithm.
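The load-balancing point in the abstract can be illustrated with a small toy model. The sketch below is not the pFTree algorithm and does not reflect any real subnet manager API; the names (`assign_round_robin`, `effective_load`), the two-uplink leaf switch, and the partition layout are all hypothetical. It only shows why spreading destinations over uplinks without regard to partitions can look balanced while the traffic that can actually flow ends up on a single link.

```python
def assign_round_robin(destinations, num_uplinks):
    """Map each destination to an uplink index in round-robin order."""
    return {dst: i % num_uplinks for i, dst in enumerate(destinations)}


def effective_load(assignment, partition_of, local_partitions, num_uplinks):
    """Per uplink, count only destinations that local sources may reach.

    Routes to destinations in other partitions are never used, so they
    are excluded from the effective load.
    """
    load = [0] * num_uplinks
    for dst, uplink in assignment.items():
        if partition_of[dst] in local_partitions:
            load[uplink] += 1
    return load


# Hypothetical setup: one leaf switch with two uplinks; all local
# sources belong to partition "A"; four remote destinations.
partition_of = {"d0": "A", "d1": "B", "d2": "A", "d3": "B"}
local_partitions = {"A"}
num_uplinks = 2
destinations = sorted(partition_of)

# Partition-unaware: spread every destination, reachable or not.
unaware = assign_round_robin(destinations, num_uplinks)

# Partition-aware (toy version): balance the reachable destinations
# on their own, then place the remaining ones separately.
reachable = [d for d in destinations if partition_of[d] in local_partitions]
others = [d for d in destinations if partition_of[d] not in local_partitions]
aware = {
    **assign_round_robin(reachable, num_uplinks),
    **assign_round_robin(others, num_uplinks),
}

unaware_load = effective_load(unaware, partition_of, local_partitions, num_uplinks)
aware_load = effective_load(aware, partition_of, local_partitions, num_uplinks)

print("partition-unaware effective load per uplink:", unaware_load)
print("partition-aware effective load per uplink:  ", aware_load)
```

In this toy setup the partition-unaware assignment puts both usable routes on the same uplink, while the partition-aware assignment places one on each. The physical-link and virtual-lane isolation mechanisms that the abstract attributes to pFTree are not modeled here.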