High-speed interconnects are critical for delivering robust, highly efficient services to every user in a cluster. Several commercial offerings, many of them now firmly established in the market, have emerged over the years, spanning the many possible tradeoffs between cost, reconfigurability, performance, resiliency, and support for a variety of processing architectures. On the other hand, custom interconnects can be an appealing solution for applications requiring cost-effectiveness, customizability, and flexibility.
In this context, the APEnet project was started in 2003, focusing on the design of PCIe FPGA-based custom Network Interface Cards (NICs) for cluster interconnects with a 3D torus topology. In this work, we highlight the main features of APEnetX, the latest version of the APEnet NIC. Implemented on the Xilinx Alveo U200 card, it performs Remote Direct Memory Access (RDMA) transactions using both Xilinx UltraScale+ IPs and custom hardware and software components, ensuring efficient data transfer without involving the host operating system. The software stack lets the user interface with the NIC either directly, via a low-level driver, or through a plug-in for the OpenMPI stack, aligning our NIC with the application-layer standards of the HPC community. The APEnetX architecture also integrates a Quality-of-Service (QoS) scheme to enforce a minimum level of performance during network congestion events. Finally, APEnetX is accompanied by an OMNeT++-based simulator that makes it possible to probe network performance at node counts otherwise unattainable for reasons of cost and practicality.
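To make the RDMA data path concrete, the following is a minimal sketch in C of how a one-sided put could be issued through a low-level NIC driver API of this kind. Every identifier below (apenet_open, apenet_reg_buf, apenet_rdma_put, apenet_wait) is a hypothetical placeholder invented for this illustration, stubbed so the example compiles; it is not the actual APEnetX driver interface described in this work.

    /* sketch_rdma_put.c - hypothetical illustration, NOT the APEnetX API */
    #include <stddef.h>
    #include <stdint.h>

    typedef struct { int fd; } apenet_dev_t;                     /* opaque device handle      */
    typedef struct { uint64_t addr; uint32_t key; } apenet_mr_t; /* registered memory region  */

    /* Hypothetical driver entry points, stubbed so the sketch builds. */
    static apenet_dev_t *apenet_open(int dev_id)
    { static apenet_dev_t d; d.fd = dev_id; return &d; }
    static apenet_mr_t apenet_reg_buf(apenet_dev_t *d, void *buf, size_t len)
    { (void)d; (void)len; apenet_mr_t mr = { (uint64_t)(uintptr_t)buf, 0 }; return mr; }
    static int apenet_rdma_put(apenet_dev_t *d, apenet_mr_t src, int dst_node,
                               uint64_t dst_addr, size_t len)
    { (void)d; (void)src; (void)dst_node; (void)dst_addr; (void)len; return 0; }
    static int apenet_wait(apenet_dev_t *d) { (void)d; return 0; }

    int main(void)
    {
        char buf[4096] = "payload";
        apenet_dev_t *dev = apenet_open(0);

        /* Register (pin) the buffer once on the control path so the NIC
         * can DMA it directly; data transfers then bypass the host OS. */
        apenet_mr_t mr = apenet_reg_buf(dev, buf, sizeof buf);

        /* One-sided put: write into node 3's memory at a known remote
         * address without interrupting the remote CPU. */
        if (apenet_rdma_put(dev, mr, 3, 0x10000, sizeof buf) != 0)
            return 1;

        return apenet_wait(dev);   /* poll the completion queue */
    }

The design point mirrored here is the one stated above: buffer registration happens once on the control path, while the actual transfers proceed without involvement of the host operating system.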
