Advanced load balancing for SPH simulations on multi-GPU architectures

2017 IEEE High Performance Extreme Computing Conference (HPEC) | Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091093

Kevin Verma, K. Szewc, R. Wille


Citations: 13

Abstract

Smoothed Particle Hydrodynamics (SPH) is a numerical method for fluid flow modeling in which the fluid is discretized by a set of particles. SPH makes it possible to model complex scenarios that are difficult or costly to measure in the real world. The method has several advantages over other approaches, but it suffers from high numerical complexity: simulating real-life phenomena can require up to several hundred million particles. Hence, HPC methods need to be leveraged to make SPH applicable to industrial applications. Distributing the computations among different GPUs to exploit massive parallelism is particularly well suited to this purpose. However, certain characteristics of SPH make it a non-trivial task to distribute the workload properly. In this work, we present a load balancing method for a CUDA-based industrial SPH implementation on multi-GPU architectures. To that end, dedicated memory handling schemes are introduced, which reduce the synchronization overhead. Experimental evaluations confirm the scalability and efficiency of the proposed methods.
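The abstract does not describe the balancing scheme itself, so the following is only a minimal sketch of the general problem it names, not the authors' method. It assumes a one-dimensional slab decomposition along x, in which each GPU owns one slab, and it uses particle count as a stand-in for workload. All function names (`balanced_boundaries`, `counts_per_gpu`, `imbalance`) are hypothetical. The sketch shows why evenly spaced cuts can leave one GPU overloaded when particles cluster, and how placing cuts at particle-count quantiles evens out the load.

```python
from bisect import bisect_right


def balanced_boundaries(xs, num_gpus):
    """Return num_gpus - 1 cut positions along x so that each slab
    holds a nearly equal number of particles (quantile-based cuts).

    Assumption: workload is proportional to particle count."""
    ordered = sorted(xs)
    n = len(ordered)
    # The g-th cut sits at the particle that starts the g-th slab.
    return [ordered[g * n // num_gpus] for g in range(1, num_gpus)]


def counts_per_gpu(xs, cuts):
    """Count particles per slab; a particle exactly on a cut
    belongs to the slab on its right."""
    counts = [0] * (len(cuts) + 1)
    for x in xs:
        counts[bisect_right(cuts, x)] += 1
    return counts


def imbalance(counts):
    """Ratio of the busiest slab to the mean load (1.0 is perfect)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean


if __name__ == "__main__":
    # Twelve particles, most of them clustered near x = 0.
    xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 1.0, 2.0, 3.0, 4.0]

    uniform_cuts = [1.0, 2.0, 3.0]  # evenly spaced over [0, 4]
    print(counts_per_gpu(xs, uniform_cuts))  # [8, 1, 1, 2]

    cuts = balanced_boundaries(xs, 4)
    print(cuts)                      # [0.3, 0.6, 2.0]
    print(counts_per_gpu(xs, cuts))  # [3, 3, 3, 3]
```

In a moving fluid the particle distribution changes over time, so cuts of this kind would have to be recomputed periodically. Each recomputation moves particles between devices, which is one plausible source of the synchronization overhead the abstract mentions.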
