SHAMan
Sophie Robert, S. Zertal, G. Goret
Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications, 2020-09-23
DOI: 10.1145/3419604.3419775 (https://doi.org/10.1145/3419604.3419775)

Like most modern computer systems, High Performance Computing (HPC) machines integrate many highly configurable hardware devices and software components. Finding their optimal parametrization is a complex task: the size of the parametric space and the non-linear behavior of HPC systems make hand tuning, theoretical modeling, and exhaustive sampling unsuitable in most cases. Auto-tuning methods relying on black-box optimization have emerged as a promising way to find a system's best parametrization without making any assumption about its behavior. In this paper, we present the architecture of an auto-tuning framework, called Smart HPC Application MANager (SHAMan), that integrates black-box optimization heuristics to find the optimal parametrization of an Input/Output (I/O) accelerator for an HPC application. We describe the conceptual and technical architecture of the framework and its native support for the HPC cluster ecosystem. We detail the stand-alone optimization engine and its integration as a service provided by a Web application. We deployed and tested the framework by tuning an I/O accelerator developed by Atos on an HPC cluster running in production. The tuner's performance is evaluated by optimizing 90 different I/O-oriented applications. We show a median improvement of 29% in speed-up compared to the default parametrization, and this improvement goes up to 98% for a certain class of applications.
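
To make the idea of black-box auto-tuning concrete, the following is a minimal sketch of a tuning loop, not SHAMan's implementation. The abstract does not name the heuristics the framework uses, so plain random search stands in for them here. The parameter names and ranges, the synthetic `run_application` cost function, and the definition of improvement as relative reduction in run time are all assumptions made for illustration.

```python
import itertools
import random

# Hypothetical parameter grid for an I/O accelerator; names and ranges are
# illustrative assumptions, not the parameters tuned in the paper.
PARAMETER_GRID = {
    "cache_size_mb": [64, 128, 256, 512, 1024],
    "prefetch_depth": [1, 2, 4, 8, 16],
    "worker_threads": [1, 2, 4, 8],
}


def run_application(params):
    """Stand-in for launching the HPC job and measuring its run time.

    A synthetic non-linear function replaces the real measurement so the
    sketch is runnable; the optimizer only ever sees the returned value.
    """
    cache = params["cache_size_mb"]
    depth = params["prefetch_depth"]
    threads = params["worker_threads"]
    return (
        100.0
        + abs(cache - 256) / 16
        + (depth - 4) ** 2
        + 40.0 / threads
        + 0.5 * depth * threads
    )


def random_search(grid, evaluate, budget, seed=0):
    """Evaluate up to `budget` distinct random configurations, keep the fastest."""
    rng = random.Random(seed)
    names = list(grid)
    candidates = list(itertools.product(*(grid[name] for name in names)))
    sampled = rng.sample(candidates, min(budget, len(candidates)))
    best_params, best_time = None, float("inf")
    for values in sampled:
        params = dict(zip(names, values))
        elapsed = evaluate(params)  # black box: no model of the system is used
        if elapsed < best_time:
            best_params, best_time = params, elapsed
    return best_params, best_time


# Assumed default configuration, used only as a baseline for comparison.
default_params = {"cache_size_mb": 64, "prefetch_depth": 1, "worker_threads": 1}
default_time = run_application(default_params)

best_params, best_time = random_search(PARAMETER_GRID, run_application, budget=20)

# Assumed definition of improvement: relative reduction in run time.
improvement = (default_time - best_time) / default_time
print(best_params, round(best_time, 2), f"{improvement:.1%}")
```

The optimizer treats `run_application` as opaque: it never inspects how the run time depends on the parameters, which is what lets the same loop be pointed at a real job launcher. In practice each evaluation is a full application run, so the evaluation budget, rather than the search logic, dominates the cost of tuning.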