Elliott Slaughter, Wei Wu, Yuankun Fu, Legend Brandenburg, N. Garcia, Wilhem Kautz, Emily Marx, Kaleb S. Morris, Wonchan Lee, Qinglei Cao, G. Bosilca, S. Mirchandaney, Sean Treichler, P. McCormick, A. Aiken
{"title":"任务工作台:用于评估并行运行时性能的参数化基准","authors":"Elliott Slaughter, Wei Wu, Yuankun Fu, Legend Brandenburg, N. Garcia, Wilhem Kautz, Emily Marx, Kaleb S. Morris, Wonchan Lee, Qinglei Cao, G. Bosilca, S. Mirchandaney, Sean Treichler, P. McCormick, A. Aiken","doi":"10.1109/SC41405.2020.00066","DOIUrl":null,"url":null,"abstract":"We present Task Bench, a parameterized benchmark designed to explore the performance of distributed programming systems under a variety of application scenarios. Task Bench dramatically lowers the barrier to benchmarking and comparing multiple programming systems by making the implementation for a given system orthogonal to the benchmarks themselves: every benchmark constructed with Task Bench runs on every Task Bench implementation. Furthermore, Task Bench’s parameterization enables a wide variety of benchmark scenarios that distill the key characteristics of larger applications.To assess the effectiveness and overheads of the tested systems, we introduce a novel metric, minimum effective task granularity (METG). We conduct a comprehensive study with 15 programming systems on up to 256 Haswell nodes of the Cori supercomputer. Running at scale, 100$\\mu$s-long tasks are the finest granularity that any system runs efficiently with current technologies. We also study each system’s scalability, ability to hide communication and mitigate load imbalance.","PeriodicalId":424429,"journal":{"name":"SC20: International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"73 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":"{\"title\":\"Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance\",\"authors\":\"Elliott Slaughter, Wei Wu, Yuankun Fu, Legend Brandenburg, N. Garcia, Wilhem Kautz, Emily Marx, Kaleb S. Morris, Wonchan Lee, Qinglei Cao, G. Bosilca, S. Mirchandaney, Sean Treichler, P. McCormick, A. Aiken\",\"doi\":\"10.1109/SC41405.2020.00066\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present Task Bench, a parameterized benchmark designed to explore the performance of distributed programming systems under a variety of application scenarios. Task Bench dramatically lowers the barrier to benchmarking and comparing multiple programming systems by making the implementation for a given system orthogonal to the benchmarks themselves: every benchmark constructed with Task Bench runs on every Task Bench implementation. Furthermore, Task Bench’s parameterization enables a wide variety of benchmark scenarios that distill the key characteristics of larger applications.To assess the effectiveness and overheads of the tested systems, we introduce a novel metric, minimum effective task granularity (METG). We conduct a comprehensive study with 15 programming systems on up to 256 Haswell nodes of the Cori supercomputer. Running at scale, 100$\\\\mu$s-long tasks are the finest granularity that any system runs efficiently with current technologies. We also study each system’s scalability, ability to hide communication and mitigate load imbalance.\",\"PeriodicalId\":424429,\"journal\":{\"name\":\"SC20: International Conference for High Performance Computing, Networking, Storage and Analysis\",\"volume\":\"73 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"34\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"SC20: International Conference for High Performance Computing, Networking, Storage and Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SC41405.2020.00066\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"SC20: International Conference for High Performance Computing, Networking, Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SC41405.2020.00066","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance
We present Task Bench, a parameterized benchmark designed to explore the performance of distributed programming systems under a variety of application scenarios. Task Bench dramatically lowers the barrier to benchmarking and comparing multiple programming systems by making the implementation for a given system orthogonal to the benchmarks themselves: every benchmark constructed with Task Bench runs on every Task Bench implementation. Furthermore, Task Bench’s parameterization enables a wide variety of benchmark scenarios that distill the key characteristics of larger applications.To assess the effectiveness and overheads of the tested systems, we introduce a novel metric, minimum effective task granularity (METG). We conduct a comprehensive study with 15 programming systems on up to 256 Haswell nodes of the Cori supercomputer. Running at scale, 100$\mu$s-long tasks are the finest granularity that any system runs efficiently with current technologies. We also study each system’s scalability, ability to hide communication and mitigate load imbalance.