Akihiro Tabuchi, M. Nakao, H. Murai, T. Boku, M. Sato
Performance evaluation for a hydrodynamics application in XcalableACC PGAS language for accelerated clusters
Proceedings of Workshops of HPC Asia, published 2018-01-31. DOI: 10.1145/3176364.3176365 (https://doi.org/10.1145/3176364.3176365)
Citation count: 1
Abstract
Clusters equipped with accelerators such as GPUs and MICs are widely used. To use these clusters, programmers typically write their applications by combining MPI with an accelerator programming model such as CUDA or OpenACC. The accelerator side of the programming is becoming easier thanks to the directive-based OpenACC, but the distributed-memory side, written in MPI, remains complex and keeps the overall programming effort high. To simplify the programming process, XcalableACC (XACC) has been proposed as an "orthogonal" integration of the PGAS language XcalableMP (XMP) and OpenACC. XACC provides the original XMP and OpenACC features, as well as extensions to them for communication between accelerator memories. In this study, we implemented the hydrodynamics mini-application Clover-Leaf in XACC and evaluated the usability of XACC in terms of its performance and productivity. In the performance evaluation, the XACC version achieved 87-95% of the performance of the MPI+CUDA version and 93-101% of the MPI+OpenACC version with strong scaling, and 88-91% of the MPI+CUDA version and 94-97% of the MPI+OpenACC version with weak scaling. In particular, the halo exchange time was shorter with XACC than with MPI+OpenACC in some cases, because the Omni XACC runtime is written in MPI and CUDA and is well tuned. The productivity evaluation showed that the application could be implemented with only small changes to the serial version. These results demonstrate that XACC is a practical programming language for scientific applications.
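For readers unfamiliar with the halo exchange mentioned in the abstract, the following is a minimal illustrative sketch in plain Python. It is not XACC, MPI, or the paper's code: the "ranks" are simply list entries in a single process, and every function name (`split_blocks`, `add_halo`, `halo_exchange`, `smooth_step`, `gather`, `serial_step`) is hypothetical. It assumes a 1D block distribution with one ghost cell per side and a 3-point averaging stencil, chosen only to show why neighbouring subdomains must swap edge values before each update.

```python
def split_blocks(data, nparts):
    """Block-distribute a 1D list over nparts simulated ranks (assumes len(data) >= nparts)."""
    base, extra = divmod(len(data), nparts)
    blocks, start = [], 0
    for r in range(nparts):
        size = base + (1 if r < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def add_halo(blocks, boundary=0.0):
    """Surround each block with one ghost cell per side, initialised to the boundary value."""
    return [[boundary] + b + [boundary] for b in blocks]


def halo_exchange(local):
    """Copy each neighbour's edge (interior) cell into this block's ghost cells."""
    for r in range(len(local)):
        if r > 0:
            local[r][0] = local[r - 1][-2]   # left ghost <- left neighbour's last interior cell
        if r < len(local) - 1:
            local[r][-1] = local[r + 1][1]   # right ghost <- right neighbour's first interior cell


def smooth_step(local):
    """Apply a 3-point average to the interior cells of every block; ghost cells are kept as-is."""
    return [
        [b[0]] + [(b[i - 1] + b[i] + b[i + 1]) / 3.0 for i in range(1, len(b) - 1)] + [b[-1]]
        for b in local
    ]


def gather(local):
    """Concatenate the interior cells of all blocks back into one list."""
    return [x for b in local for x in b[1:-1]]


def serial_step(data, boundary=0.0):
    """Reference: the same 3-point average computed on the undistributed array."""
    padded = [boundary] + data + [boundary]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0 for i in range(1, len(padded) - 1)]


if __name__ == "__main__":
    data = [float(i) for i in range(10)]
    local = add_halo(split_blocks(data, 3))
    halo_exchange(local)                      # must happen before the stencil update
    distributed = gather(smooth_step(local))
    print(distributed == serial_step(data))   # the distributed and serial results should agree
```

In a real accelerated cluster the copies inside `halo_exchange` become messages between nodes, and possibly between accelerator memories, which is the step whose cost the abstract compares across the XACC and MPI+OpenACC versions.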