Analyzing the impact of programming models for efficient communication overlap in high-speed networks
G. Utrera, Marisa Gil, X. Martorell
2014 International Conference on High Performance Computing & Simulation (HPCS), pp. 218-225, 2014-07-21
DOI: 10.1109/HPCSim.2014.6903689 (https://doi.org/10.1109/HPCSim.2014.6903689)
Exascale applications in civil engineering, simulation, and other active research fields make intensive use of large sparse matrices. A characteristic of these matrices is that communication and computation are difficult to balance, so that even when the two phases are overlapped the application does not scale well overall and instead loses performance. Several proposals aim to reduce this drawback through hybrid programming models, using MPI for communication and threads for computation (mainly OpenMP, but also Cilk, CUDA, or OpenCL) to adapt to new heterogeneous platforms. In this work, we evaluate the impact of providing task-based parallelism instead of fork-join parallelism. On the communication side, the appearance of faster networks with specific optimizations and internal protocol characteristics makes it appealing to analyze and evaluate their influence on execution performance. We evaluate our results on two different communication networks: 10 Gigabit Ethernet and InfiniBand. For our evaluations we run the miniFE miniapplication from the Mantevo benchmark suite on a homogeneous supercomputer platform based on Intel Sandy Bridge processors. Experimental results show how network behavior can affect performance and how it can be managed with task-based models: relative to a hybrid MPI/OpenMP version that overlaps communication and computation, our task-based MPI/OmpSs proposal obtains up to 60% improvement.
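The abstract's point that overlap alone does not guarantee good scaling when communication and computation are unbalanced can be illustrated with a deliberately simplified timing model. The sketch below is not taken from the paper: it assumes that an iteration without overlap costs the sum of its communication and computation times, and that an ideally overlapped iteration costs the larger of the two. The function names `step_time` and `overlap_gain` are hypothetical, and the model ignores everything the paper actually measures (network protocol behavior, runtime overhead, task scheduling).

```python
def step_time(comm: float, comp: float, overlap: bool) -> float:
    """Time of one iteration under an idealized model (an assumption,
    not the paper's method): phases either run back to back or fully
    in parallel."""
    return max(comm, comp) if overlap else comm + comp


def overlap_gain(comm: float, comp: float) -> float:
    """Ratio of non-overlapped to ideally overlapped iteration time."""
    return step_time(comm, comp, overlap=False) / step_time(comm, comp, overlap=True)


if __name__ == "__main__":
    # Arbitrary illustrative time units, not measurements.
    for comm, comp in [(1.0, 1.0), (1.0, 4.0), (4.0, 1.0)]:
        gain = overlap_gain(comm, comp)
        print(f"comm={comm} comp={comp} ideal overlap gain={gain:.2f}x")
```

Under this model the gain from overlap peaks at 2x when the two phases take equal time and shrinks toward 1x as one phase dominates, which is one way to read why balance matters as much as overlap itself.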