Understanding I/O Direct Cache Access Performance for End Host Networking

Proceedings of the ACM on Measurement and Analysis of Computing Systems. Pub Date: 2022-02-24. DOI: 10.1145/3508042

Minhu Wang, Mingwei Xu, Jianping Wu


Citations: 1

Abstract

Direct Cache Access (DCA) enables a network interface card (NIC) to load and store data directly on the processor cache, as conventional Direct Memory Access (DMA) is no longer suitable as the bridge between NIC and CPU in the era of 100 Gigabit Ethernet. As numerous I/O devices and cores compete for scarce cache resources, making the most of DCA for networking applications with varied objectives and constraints is a challenge, especially given the increasing complexity of modern cache hardware and I/O stacks. In this paper, we reverse engineer details of one commercial implementation of DCA, Intel's Data Direct I/O (DDIO), to explicate the importance of hardware-level investigation into DCA. Based on the learned knowledge of DCA and network I/O stacks, we (1) develop an analytical framework to predict the effectiveness of DCA (i.e., its hit rate) under certain hardware specifications, system configurations, and application properties; (2) measure penalties of the ineffective use of DCA (i.e., its miss penalty) to characterize its benefits; and (3) show that our reverse engineering, measurement, and model contribute to a deeper understanding of DCA, which in turn helps diagnose, optimize, and design end-host networking.
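The abstract's notion of a DCA hit rate can be illustrated with a toy simulation. This is not the paper's analytical framework, whose details are not given here; it is a minimal sketch under stated assumptions: the NIC may write-allocate into a fixed number of cache lines, those lines are replaced oldest-write-first, the NIC fills a receive ring cyclically, and the CPU reads each packet a fixed number of packets behind the NIC. The function name and all parameters (`cache_lines`, `ring_slots`, `lines_per_packet`, `backlog`) are hypothetical.

```python
from collections import OrderedDict


def simulate_dca_hit_rate(cache_lines, ring_slots, lines_per_packet,
                          backlog, n_packets=10_000):
    """Toy estimate of the fraction of CPU reads served from the DCA region.

    Assumptions (illustrative only, not taken from the paper):
      - the NIC can allocate into at most `cache_lines` cache lines;
      - replacement is oldest-write-first, and CPU reads do not refresh a line;
      - the CPU reads packet i - backlog right after the NIC writes packet i.
    """
    if not 0 <= backlog < ring_slots:
        raise ValueError("backlog must be in [0, ring_slots)")

    cache = OrderedDict()  # keys are (ring slot, line index), oldest first
    hits = probes = 0

    for i in range(n_packets):
        # NIC writes packet i into its ring slot, one cache line at a time.
        slot = i % ring_slots
        for line in range(lines_per_packet):
            key = (slot, line)
            cache[key] = True
            cache.move_to_end(key)          # a rewrite makes the line newest
            if len(cache) > cache_lines:
                cache.popitem(last=False)   # evict the oldest written line

        # CPU consumes the packet that arrived `backlog` packets earlier.
        if i >= backlog:
            read_slot = (i - backlog) % ring_slots
            for line in range(lines_per_packet):
                probes += 1
                hits += (read_slot, line) in cache

    return hits / probes if probes else 0.0
```

In this toy model every read hits while `(backlog + 1) * lines_per_packet` fits within `cache_lines`, and the hit rate drops once the data in flight between NIC and CPU outgrows the DCA-writable region. Real hardware and I/O stacks add effects this sketch ignores, such as contention from cores and other devices.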


Source journal

Proceedings of the ACM on Measurement and Analysis of Computing Systems

CiteScore: 3.20

Self-citation rate: 0.00%
