Portability of Vectorization-aware Performance Tuning Expertise across System Generations
Shunpei Sugawara, Yoichi Shimomura, Ryusuke Egawa, H. Takizawa
2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), published 2021-12-01
DOI: https://doi.org/10.1109/MCSoC51149.2021.00043
Abstract
Even expert HPC programmers must invest considerable time and effort in empirically establishing effective performance tuning strategies for their target systems. When the target system is changed or updated, it is therefore desirable that as much of their performance tuning expertise as possible can be ported to the new system. In this paper, we focus on multiple generations of NEC SX series vector systems. We have documented the performance tuning expertise for the previous generations and built a machine-usable database of performance tuning cases. This paper investigates how much the expertise recorded in the database can contribute to performance tuning for the latest generation, NEC SX-Aurora TSUBASA (SX-AT). Since the system architecture and the software stack, including the compilers, have been completely renewed for SX-AT, this paper discusses the differences in performance tuning across system generations. It also discusses how to express performance tuning techniques in a machine-usable way. The case study indicates that the Xevolver approach of using user-defined code transformations can express most of the vectorization-aware performance tuning techniques, and is thus promising for recording performance tuning expertise in a future-proof fashion.