Title: Speeding up the communications on a cluster using MPI by means of Software Defined Networks
Journal: Future Generation Computer Systems - The International Journal of eScience (Q1, Computer Science, Theory & Methods)
DOI: 10.1016/j.future.2024.07.047
Published: 2024-07-31
URL: https://www.sciencedirect.com/science/article/pii/S0167739X24004217
Citations: 0
Abstract
The Open MPI library is widely used to implement the message-passing programming model in parallel applications running on distributed-memory computer systems, such as large data centers. These applications aim to exploit as many of the resources as High Performance Computing (HPC) systems can provide. The interconnection network is an essential part of the HPC environment, as the processes of a parallel application are constantly communicating and sharing data. Software Defined Networking (SDN) is a networking approach that separates the control plane from the data-forwarding plane, so the network can be configured according to its status or to the specific communication requirements of parallel applications. Given that communication time contributes significantly to the overall execution time of a parallel program, and considering the time Open MPI spends initializing TCP connections between processes on Ethernet networks, this paper proposes integrating a software-defined networking environment into the Open MPI library. The primary objective of our contribution is to provide the network controller with information about the Open MPI processes, so that it can configure the network during the initialization procedure of the Open MPI library. This facilitates the development of SDN-based routing techniques that reduce communication times, and thus execution times, using application-level information such as the Open MPI endpoints participating in a parallel program execution. To demonstrate the utility of the information provided by Open MPI processes, we have implemented a routing algorithm that computes optimal paths between processes using the weighted Dijkstra algorithm, with the number of flows traversing each topology link as the edge weights.
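The abstract does not reproduce the registration mechanism in code, but the step it describes (each Open MPI process reporting its endpoints to the SDN controller at initialization time) might be sketched as follows. The REST path, controller URL, and JSON schema here are illustrative assumptions, not the paper's actual interface:

```python
import json
import urllib.request


def build_registration(job_id, rank, endpoints):
    """Build a JSON-serializable body describing one MPI process's TCP
    endpoints. The schema is hypothetical, not taken from the paper."""
    return {"job": job_id, "rank": rank, "endpoints": endpoints}


def register_endpoints(controller_url, registration):
    """POST the registration to a hypothetical SDN-controller REST API so
    the controller can compute and install flow rules before the
    application's communication phase begins."""
    req = urllib.request.Request(
        controller_url + "/mpi/endpoints",
        data=json.dumps(registration).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200
```

In such a design the call would be placed inside the library's initialization path, so the controller knows every endpoint pair before the first message is sent.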
The evaluation of the proposed mechanism, using a 2-stage fat-tree topology and two parallel applications (a matrix product and the Model for Prediction Across Scales, MPAS), showed significant improvements in execution time: reductions of up to 2.5 times for a 4096 × 4096 matrix product, 1.3 times for an 8192 × 8192 matrix product, and 1.5 times for MPAS in the worst network-occupancy scenario. This demonstrates the improvement in communication time and, therefore, in execution time.
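The routing idea the abstract describes (weighted Dijkstra with the number of flows already traversing each link as the weight) can be sketched as follows. The topology, weight formula (1 plus the flow count, so an idle link costs 1), and function names are illustrative assumptions, not the paper's implementation:

```python
import heapq


def dijkstra_flow_weighted(adj, flows, src, dst):
    """Shortest path where each link's cost is 1 + the number of flows
    already routed over it, so new flows prefer lightly loaded links.
    adj: node -> list of neighbors; flows: (u, v) -> current flow count."""
    dist = {src: 0}
    prev = {}
    pq = [(0, src)]
    visited = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in visited:
            continue
        visited.add(u)
        if u == dst:
            break
        for v in adj.get(u, []):
            nd = d + 1 + flows.get((u, v), 0)
            if v not in visited and nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Walk the predecessor map back from dst to recover the path.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path)), dist[dst]


# Tiny 2-stage fat-tree-like example: two leaves (l0, l1), two spines
# (s0, s1), hosts h0 and h2. With 3 flows already on link (l0, s0),
# a new h0 -> h2 flow is routed through the idle spine s1.
adj = {
    "h0": ["l0"], "h2": ["l1"],
    "l0": ["h0", "s0", "s1"], "l1": ["h2", "s0", "s1"],
    "s0": ["l0", "l1"], "s1": ["l0", "l1"],
}
flows = {("l0", "s0"): 3}
path, cost = dijkstra_flow_weighted(adj, flows, "h0", "h2")
# path == ["h0", "l0", "s1", "l1", "h2"], cost == 4
```

After each path is installed, the controller would increment the flow counters on its links, so subsequent flows spread across the remaining spines.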
Journal description:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.