sRDMA: A General and Low-Overhead Scheduler for RDMA
Xizheng Wang, Shuai Wang, Dan Li
Proceedings of the 7th Asia-Pacific Workshop on Networking, published 2023-06-29
DOI: 10.1145/3600061.3600082 (https://doi.org/10.1145/3600061.3600082)
Citations: 0
Abstract
Remote Direct Memory Access (RDMA) has been widely deployed in data centers to improve application performance. However, RDMA delivers messages in order, which cannot meet applications' emerging need to schedule messages within a single RDMA connection and leaves RDMA underutilized. Some prior work schedules the data to be transferred inside specific applications before handing it to RDMA, or distributes messages across different connections. These approaches tightly couple scheduling logic with application logic and may incur high scheduling overhead. In this paper, we propose sRDMA, a general and low-overhead scheduler that works in the user-space RDMA driver. sRDMA allows the application to express the expected transfer order to RDMA hardware via work requests (WRs). Using priority information carried in WRs, sRDMA slices and schedules WRs to achieve the desired order of message transfer and to reduce the blocking impact of large messages in the same RDMA connection. Our experiments show that sRDMA can improve the performance of applications, e.g., TensorFlow, by up to , and sRDMA has negligible overhead in terms of CPU and flow throughput.
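The slicing-and-scheduling idea in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the sRDMA implementation: it runs in plain Python rather than in an RDMA driver, and the names `Slice`, `SliceScheduler`, `post`, `next_slice`, and `demo`, the slice size, and the convention that a lower number means higher priority are all hypothetical. It shows how cutting a large message into slices lets a later, more urgent message overtake the remainder of the large one instead of waiting behind it.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Slice:
    priority: int                       # lower value = more urgent (assumption)
    seq: int                            # tie-breaker: keeps FIFO order within a priority
    wr_id: str = field(compare=False)   # hypothetical work-request identifier
    offset: int = field(compare=False)  # byte offset of this slice in the message
    length: int = field(compare=False)  # slice length in bytes


class SliceScheduler:
    """Toy priority scheduler over fixed-size slices of work requests."""

    def __init__(self, slice_bytes):
        self.slice_bytes = slice_bytes
        self._heap = []
        self._seq = count()

    def post(self, wr_id, priority, nbytes):
        # Cut the message into slices so it can be interleaved with others.
        for offset in range(0, nbytes, self.slice_bytes):
            length = min(self.slice_bytes, nbytes - offset)
            item = Slice(priority, next(self._seq), wr_id, offset, length)
            heapq.heappush(self._heap, item)

    def next_slice(self):
        # Return the most urgent pending slice, or None when idle.
        return heapq.heappop(self._heap) if self._heap else None


def demo():
    sched = SliceScheduler(slice_bytes=4)
    sent = []

    sched.post("big", priority=5, nbytes=10)    # large, low-priority message
    first = sched.next_slice()                  # its first slice goes out
    sent.append(f"{first.wr_id}@{first.offset}")

    sched.post("urgent", priority=0, nbytes=3)  # urgent message arrives later
    while (s := sched.next_slice()) is not None:
        sent.append(f"{s.wr_id}@{s.offset}")
    return sent


if __name__ == "__main__":
    print(demo())
```

In this toy run the urgent message is sent right after the first slice of the large message, ahead of its remaining two slices. Without slicing, the large message would occupy the connection until all 10 bytes were sent.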