Critical-path candidates: scalable performance modeling for MPI workloads
Jian Chen, R. Clapp
2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), published 2015-03-29
DOI: 10.1109/ISPASS.2015.7095779 (https://doi.org/10.1109/ISPASS.2015.7095779)
Citations: 12
Abstract
Efficient and scalable performance modeling is essential to high-performance cluster computing. Critical-path-based performance analysis is widely used because it provides valuable insight into the performance of parallel programs, but its strong reliance on trace-driven simulation makes it expensive, inefficient, and inflexible. This paper presents a performance modeling framework based on a new concept, critical-path candidates: a group of paths, each of which could potentially be the critical path. Because they are expressed in terms of instruction and communication counts, the critical-path candidates capture the intrinsic computation and communication dependencies of a workload and can therefore be reused to explore multiple design options. Using real-world MPI workloads, we show that the proposed framework predicts runtime to within 10% of the measured runtime for up to 16K MPI ranks. The framework provides an efficient and scalable platform for performance analysis as well as load imbalance analysis.
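To make the reuse idea concrete, here is a minimal Python sketch of how a fixed set of candidate paths, described only by counts, might be re-evaluated under different design options. It is an illustration under stated assumptions, not the paper's actual model: the class names, the field names, and the linear cost model (instruction rate plus per-message latency plus bandwidth term) are all hypothetical, and the numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    """One path that could be critical, described only by counts (hypothetical fields)."""
    name: str
    instructions: int  # instructions executed along the path
    messages: int      # messages sent along the path
    bytes_sent: int    # total payload of those messages


@dataclass(frozen=True)
class DesignOption:
    """A machine configuration to evaluate (hypothetical linear cost model)."""
    name: str
    instr_per_sec: float
    latency_sec: float    # fixed cost per message
    bytes_per_sec: float  # network bandwidth


def path_time(path: CandidatePath, option: DesignOption) -> float:
    """Estimate the time of one candidate path under one design option."""
    compute = path.instructions / option.instr_per_sec
    comm = (path.messages * option.latency_sec
            + path.bytes_sent / option.bytes_per_sec)
    return compute + comm


def predict_runtime(candidates, option):
    """Return (name, time) of the slowest candidate, i.e. the critical path."""
    critical = max(candidates, key=lambda p: path_time(p, option))
    return critical.name, path_time(critical, option)


# Made-up candidate set: collected once, reused for every design option.
CANDIDATES = [
    CandidatePath("compute_heavy", 8_000_000_000, 1_000, 1_000_000),
    CandidatePath("comm_heavy", 2_000_000_000, 200_000, 4_000_000_000),
]

BASELINE = DesignOption("baseline", 2e9, 20e-6, 1e9)
FAST_NET = DesignOption("fast_net", 2e9, 2e-6, 10e9)

if __name__ == "__main__":
    for option in (BASELINE, FAST_NET):
        name, seconds = predict_runtime(CANDIDATES, option)
        print(f"{option.name}: critical path = {name}, runtime = {seconds:.4f} s")
```

With these made-up numbers, the communication-heavy candidate is the critical path on the baseline configuration, while the compute-heavy candidate becomes critical once the network is faster. That is the reason for keeping a set of candidates rather than a single traced critical path: the counts stay fixed, and only the cheap per-option evaluation is repeated.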