Hossein Golestani, Amirhossein Mirhosseini, T. Wenisch
{"title":"软件数据平面:你不能总是为了赢而旋转","authors":"Hossein Golestani, Amirhossein Mirhosseini, T. Wenisch","doi":"10.1145/3357223.3362737","DOIUrl":null,"url":null,"abstract":"Today's datacenters demand high-performance, energy-efficient software data planes, which are widely used in many areas including fast network packet processing, network function virtualization, high-speed data transfer in storage systems, and I/O virtualization. Modern software data planes bypass OS I/O stacks and rely on cores spinning on user-level queues as a fast notification mechanism. Whereas spin-polling can improve latency and throughput, it entails significant shortcomings, especially when scaling to large numbers of cores/queues. In this paper, we pinpoint and quantify challenges of spin-polling--based software data planes using Intel's Data Plane Development Kit (DPDK) as a representative infrastructure. We characterize four scalability issues of software data planes: (1) Full-tilt spinning cores perform more (useless) polling work when there is less work pending in the queues; (2) Spin-polling scales poorly with the number of polled queues due to processor cache capacity constraints, especially when traffic is unbalanced; (3) Operation rate limits (transactions per second) as well as a Polling Tax (the overhead of polling, which is considerable even when operating at saturation throughput) result in poor core scalability. (4) Whereas shared queues can mitigate load imbalance and head-of-line-blocking, synchronization overheads limit their potential benefits. We identify root causes of these issues and discuss solution directions to improve hardware and software abstractions for better performance, efficiency, and scalability in software data planes.","PeriodicalId":91949,"journal":{"name":"Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... ... SoCC (Conference)","volume":"28 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"Software Data Planes: You Can't Always Spin to Win\",\"authors\":\"Hossein Golestani, Amirhossein Mirhosseini, T. Wenisch\",\"doi\":\"10.1145/3357223.3362737\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Today's datacenters demand high-performance, energy-efficient software data planes, which are widely used in many areas including fast network packet processing, network function virtualization, high-speed data transfer in storage systems, and I/O virtualization. Modern software data planes bypass OS I/O stacks and rely on cores spinning on user-level queues as a fast notification mechanism. Whereas spin-polling can improve latency and throughput, it entails significant shortcomings, especially when scaling to large numbers of cores/queues. In this paper, we pinpoint and quantify challenges of spin-polling--based software data planes using Intel's Data Plane Development Kit (DPDK) as a representative infrastructure. We characterize four scalability issues of software data planes: (1) Full-tilt spinning cores perform more (useless) polling work when there is less work pending in the queues; (2) Spin-polling scales poorly with the number of polled queues due to processor cache capacity constraints, especially when traffic is unbalanced; (3) Operation rate limits (transactions per second) as well as a Polling Tax (the overhead of polling, which is considerable even when operating at saturation throughput) result in poor core scalability. (4) Whereas shared queues can mitigate load imbalance and head-of-line-blocking, synchronization overheads limit their potential benefits. We identify root causes of these issues and discuss solution directions to improve hardware and software abstractions for better performance, efficiency, and scalability in software data planes.\",\"PeriodicalId\":91949,\"journal\":{\"name\":\"Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... ... SoCC (Conference)\",\"volume\":\"28 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-11-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... ... SoCC (Conference)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3357223.3362737\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... ... SoCC (Conference)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3357223.3362737","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Software Data Planes: You Can't Always Spin to Win
Today's datacenters demand high-performance, energy-efficient software data planes, which are widely used in many areas including fast network packet processing, network function virtualization, high-speed data transfer in storage systems, and I/O virtualization. Modern software data planes bypass OS I/O stacks and rely on cores spinning on user-level queues as a fast notification mechanism. Whereas spin-polling can improve latency and throughput, it entails significant shortcomings, especially when scaling to large numbers of cores/queues. In this paper, we pinpoint and quantify challenges of spin-polling--based software data planes using Intel's Data Plane Development Kit (DPDK) as a representative infrastructure. We characterize four scalability issues of software data planes: (1) Full-tilt spinning cores perform more (useless) polling work when there is less work pending in the queues; (2) Spin-polling scales poorly with the number of polled queues due to processor cache capacity constraints, especially when traffic is unbalanced; (3) Operation rate limits (transactions per second) as well as a Polling Tax (the overhead of polling, which is considerable even when operating at saturation throughput) result in poor core scalability. (4) Whereas shared queues can mitigate load imbalance and head-of-line-blocking, synchronization overheads limit their potential benefits. We identify root causes of these issues and discuss solution directions to improve hardware and software abstractions for better performance, efficiency, and scalability in software data planes.