A Parallel Meta-Solver for the Multi-Objective Set Covering Problem
Ryan J. Marshall, Lakmali Weerasena, A. Skjellum
Pub Date: 2021-06-01 | DOI: 10.1109/IPDPSW52791.2021.00085
The multi-objective set covering problem (MOSCP) appears in many real-world applications. We implemented a meta-solver in C++ that introduces shared-memory concurrency using OpenMP. It combines a commonly used mixed-integer programming (MIP) solver, used to find initial solutions, with a linear programming (LP) solver that enumerates candidate solutions over a tree of subproblems using a local branching approach. Subject to a finite cutoff value, solutions are ordered as they are passed back up the tree to produce the set of Pareto fronts. In this paper, we present a serial version of the meta-solver with a novel search procedure that outperforms a previous implementation; when parallelization techniques are applied, a 9-12x speedup is achieved, with the possibility of further improvement for large problems.
{"title":"A Parallel Meta-Solver for the Multi-Objective Set Covering Problem","authors":"Ryan J. Marshall, Lakmali Weerasena, A. Skjellum","doi":"10.1109/IPDPSW52791.2021.00085","DOIUrl":"https://doi.org/10.1109/IPDPSW52791.2021.00085","url":null,"abstract":"The multi-objective set covering problem (MOSCP) appears in many different real-world applications. We implemented a meta-solver in C++ that introduces shared-memory concurrency using OpenMP. It incorporates a commonly used Mixed Integer Problem (MIP) solver to find initial solutions with a linear programming (LP) solver that enumerates possible solutions over a tree of subproblems using a local branch approach. Adhering to a finite cutoff value, solutions are ordered as they are passed back up the tree to produce the set of Pareto fronts. In this paper, we present a serial version of the meta-solver with a novel search procedure that outperforms a previous implementation, and when parallelization techniques are applied, a 9-12x speedup is achieved with the possibility of further improvement for large problems.","PeriodicalId":170832,"journal":{"name":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"212 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115974200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HPS 2021 Invited Speaker-1: The Storage System of the Fugaku Supercomputer
Pub Date: 2021-06-01 | DOI: 10.1109/ipdpsw52791.2021.00150
Introduction to JSSPP 2021
Pub Date: 2021-06-01 | DOI: 10.1109/ipdpsw52791.2021.00101
Message from the PDSEC-21 Workshop Chairs
Pub Date: 2021-06-01 | DOI: 10.1109/ipdpsw52791.2021.00102
SNACS 2021 Keynote: Ultrascale System Interconnects at the end of Moore’s Law
Pub Date: 2021-06-01 | DOI: 10.1109/ipdpsw52791.2021.00122
Beyond Fork-Join: Integration of Performance Portable Kokkos Kernels with HPX
Gregor Daiß, Mikael Simberg, Auriane Reverdell, J. Biddiscombe, Theresa Pollinger, H. Kaiser, D. Pflüger
Pub Date: 2021-06-01 | DOI: 10.1109/IPDPSW52791.2021.00066
With a widening range of GPU vendors and the trend toward more GPUs per compute node in supercomputers such as Summit, Perlmutter, Frontier, and Aurora, developing performant yet portable distributed HPC applications becomes ever more challenging. Leveraging existing solutions like Kokkos for platform-independent code and HPX for distributing the application in a task-based fashion can alleviate these challenges. However, using such frameworks in the same application requires them to work together seamlessly. In this work, we present an HPX-Kokkos integration that works both ways: we can integrate CPU and GPU Kokkos kernels as HPX tasks and, inversely, use HPX worker threads to work on Kokkos kernels. Using HPX futures makes launching and synchronizing Kokkos kernels from multiple threads easy, allowing us to move away from the more traditional fork-join model. To evaluate our integration, we ported existing Vc and CUDA kernels in an existing HPX application, Octo-Tiger, to Kokkos. We achieve comparable or better performance than with the previous Vc and CUDA kernels, demonstrating both the viability of our HPX-Kokkos integration and the future-proofing of Octo-Tiger for a wider range of potential machines. Furthermore, we introduce event polling for synchronizing CUDA kernels (or Kokkos kernels on the respective backend), achieving speedups over the previous callback-based solution.
{"title":"Beyond Fork-Join: Integration of Performance Portable Kokkos Kernels with HPX","authors":"Gregor Daiß, Mikael Simberg, Auriane Reverdell, J. Biddiscombe, Theresa Pollinger, H. Kaiser, D. Pflüger","doi":"10.1109/IPDPSW52791.2021.00066","DOIUrl":"https://doi.org/10.1109/IPDPSW52791.2021.00066","url":null,"abstract":"Between a widening range of GPU vendors and the trend of having more GPUs per compute node in supercomputers such as Summit, Perlmutter, Frontier and Aurora, developing performant yet portable distributed HPC applications becomes ever more challenging. Leveraging existing solutions like Kokkos for platform-independent code and HPX for distributing the application in a task-based fashion can alleviate these challenges. However, using such frameworks in the same application requires them to work together seamlessly. In this work we present an HPX Kokkos integration that works both ways: we can integrate CPU and GPU Kokkos kernels as HPX tasks and inversely use HPX worker threads to work on Kokkos kernels. Using HPX futures makes launching and synchronizing Kokkos kernels from multiple threads easy, allowing us to move away from the more traditional fork-join model. To evaluate our integrations we ported existing Vc and CUDA kernels within an existing HPX application, Octo-Tiger, to use Kokkos instead. We achieve comparable, or better, performance than with previous Vc and CUDA kernels, showing both the viability of our HPX Kokkos integration, as well as future-proofing Octo-Tiger for a wider range of potential machines. Furthermore, we introduce event polling for synchronizing CUDA kernels (or Kokkos kernels on the respective backend) achieving speedups over the previous solution using callbacks.","PeriodicalId":170832,"journal":{"name":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131765130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}