Younghyun Cho, J. Demmel, Jacob King, X. Li, Yang Liu, Hengrui Luo
Published in: 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2023-05-01. DOI: https://doi.org/10.1109/IPDPS54959.2023.00069
Harnessing the Crowd for Autotuning High-Performance Computing Applications
This paper presents GPTuneCrowd, a crowd-based autotuning framework for high-performance computing applications. GPTuneCrowd collects performance data from many users through a user-friendly tuner interface. It then applies novel autotuning techniques, based on transfer learning and parameter sensitivity analysis, to maximize tuning quality using the data collected from the crowd. The paper presents several real-world case studies of GPTuneCrowd. In our evaluation, GPTuneCrowd’s transfer learning improves the tuned performance of ScaLAPACK’s PDGEQRF by 1.57x and of the plasma fusion code NIMROD by 2.97x over an autotuner without transfer learning. We also use GPTuneCrowd’s sensitivity analysis to reduce the search spaces of SuperLU_DIST and Hypre. Tuning on the reduced search space achieves 1.17x and 1.35x better tuned performance for SuperLU_DIST and Hypre, respectively, than tuning on the original search space.
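To illustrate how a sensitivity analysis can shrink a search space, here is a minimal sketch. It is not GPTuneCrowd's API, and the abstract does not say which sensitivity method the framework uses. The sketch assumes a simple variance-based estimate: the share of runtime variance explained by each parameter on its own, computed by binning. The runtime model `toy_runtime`, the parameter names `block`, `procs`, and `pad`, and the 0.05 threshold are all hypothetical.

```python
import random
from statistics import mean, pvariance


def toy_runtime(cfg):
    """Hypothetical runtime model: 'block' and 'procs' matter, 'pad' barely does."""
    return (
        10.0 * (cfg["block"] - 0.3) ** 2
        + 4.0 * (cfg["procs"] - 0.7) ** 2
        + 0.01 * cfg["pad"]
    )


def sensitivity(samples, runtimes, name, bins=10):
    """Estimate Var(E[y | x_name]) / Var(y) by binning x_name over [0, 1)."""
    groups = [[] for _ in range(bins)]
    for cfg, y in zip(samples, runtimes):
        idx = min(int(cfg[name] * bins), bins - 1)
        groups[idx].append(y)
    overall = mean(runtimes)
    # Weighted variance of the per-bin mean runtimes (between-bin variance).
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups if g)
    between /= len(runtimes)
    return between / pvariance(runtimes)


def reduce_space(scores, threshold=0.05):
    """Keep only the parameters whose sensitivity score reaches the threshold."""
    return [name for name, score in scores.items() if score >= threshold]


# Stand-in for crowd-collected samples: random configurations in the unit cube.
rng = random.Random(0)
names = ["block", "procs", "pad"]
samples = [{n: rng.random() for n in names} for _ in range(2000)]
runtimes = [toy_runtime(cfg) for cfg in samples]

scores = {n: sensitivity(samples, runtimes, n) for n in names}
kept = reduce_space(scores)
print(scores)
print("parameters kept for tuning:", kept)
```

In this toy setting the tuner would then search only over the kept parameters and fix the dropped ones at a default value, which is the general idea behind tuning on a reduced search space.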