{"title":"An efficient parallelized discrete particle model for dense gas-solid flows on unstructured mesh","authors":"C. L. Wu, K. Nandakumar","doi":"10.1145/2016741.2016752","DOIUrl":null,"url":null,"abstract":"An efficient, parallelized implementation of discrete particle/element model (DPM or DEM) coupled with the computational fluid dynamics (CFD) model has been developed. Two parallelization strategies are used to partly overcome the poor load balancing problem due to the heterogeneous particle distribution in space. Firstly at the coarse-grained level, the solution domain is decomposed into partitions using bisection algorithm to minimize the number of faces at the partition boundaries while keeping almost equal number of cells in each partition. The solution of the gas-phase governing equations is performed on these partitions. Particles and the solution of their dynamics are associated with partitions according to their hosting cells. This makes no data exchange between processors when calculating the hydrodynamic forces on particles. By introducing proper data mapping between partitions, the cell void fraction is calculated accurately even if a particle is shared by several partitions. Neighboring partitions are grouped by a gross evaluation before simulation, with each group having similar particle number. The computation task of a group of partitions is assigned to a compute node, which has multi-cores or multiprocessors with a shared memory. Each core or processor in a node takes the computation of the gas governing equations in one partition. Processors communicate and exchange data through Message Passing Interface (MPI) at this coarse-grained parallelism. Secondly, the multithreading technique is used to parallelize the computation of the dynamics of the particles in each partition. The number of compute threads is determined according to the number of particles in partitions and the number of cores in a compute node. In such a way there is almost no waiting of the threads in a compute node. Since the particle numbers in all compute nodes are almost the same, the above strategy yields an efficient load balancing among compute nodes. Test numerical experiments on TeraGrid HPC cluster Queen Bee show that the developed code is efficient and scalable to simulate dense gas-solid flows with up to more than 10 millions of particles by 128 compute nodes. Bubbling in a middle-scale fluidized bed and granular Rayleigh-Taylor instability are well captured by the parallel code.","PeriodicalId":257555,"journal":{"name":"TeraGrid Conference","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"TeraGrid Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2016741.2016752","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
An efficient, parallelized implementation of a discrete particle/element model (DPM or DEM) coupled with a computational fluid dynamics (CFD) model has been developed. Two parallelization strategies are used to partly overcome the poor load balancing caused by the heterogeneous spatial distribution of particles. First, at the coarse-grained level, the solution domain is decomposed into partitions using a bisection algorithm that minimizes the number of faces at partition boundaries while keeping an almost equal number of cells in each partition. The gas-phase governing equations are solved on these partitions. Particles, and the solution of their dynamics, are associated with partitions according to their hosting cells, so no data exchange between processors is required when calculating the hydrodynamic forces on particles. By introducing a proper data mapping between partitions, the cell void fraction is calculated accurately even when a particle is shared by several partitions. Neighboring partitions are grouped by a gross evaluation before the simulation so that each group contains a similar number of particles. The computation for a group of partitions is assigned to a compute node with multiple cores or processors sharing memory; each core or processor in a node handles the gas governing equations in one partition. Processors communicate and exchange data through the Message Passing Interface (MPI) at this coarse-grained level of parallelism. Second, multithreading is used to parallelize the computation of particle dynamics in each partition. The number of compute threads is determined from the number of particles in the partitions and the number of cores in a compute node, so that threads in a compute node experience almost no waiting. Since the particle counts on all compute nodes are nearly the same, this strategy yields efficient load balancing across compute nodes. Numerical experiments on the TeraGrid HPC cluster Queen Bee show that the developed code is efficient and scales to simulations of dense gas-solid flows with more than 10 million particles on 128 compute nodes. Bubbling in a mid-scale fluidized bed and the granular Rayleigh-Taylor instability are well captured by the parallel code.
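As a rough illustration of the two-level load-balancing idea described above, the sketch below (not taken from the paper; names such as groupPartitions and threadsForNode are hypothetical) groups mesh partitions onto compute nodes so that each node receives a similar total particle count, and then sizes the particle-dynamics thread pool on a node from its particle share and core count. A greedy longest-processing-time rule is used purely for illustration; the paper describes the grouping only as a gross evaluation performed before the simulation.

```cpp
// Illustrative sketch (not the authors' code): greedy grouping of mesh
// partitions onto compute nodes so that each node hosts a similar total
// particle count, plus a simple rule for sizing the particle-dynamics
// thread pool on a node. All names are hypothetical.
#include <algorithm>
#include <cstdio>
#include <vector>

struct Partition {
    int id;             // partition index from the mesh bisection
    long particleCount; // particles currently hosted by this partition
};

// Assign partitions to numNodes groups by always giving the next largest
// partition to the currently lightest node (greedy LPT rule).
std::vector<std::vector<int>> groupPartitions(std::vector<Partition> parts,
                                              int numNodes) {
    std::sort(parts.begin(), parts.end(),
              [](const Partition& a, const Partition& b) {
                  return a.particleCount > b.particleCount;
              });
    std::vector<std::vector<int>> groups(numNodes);
    std::vector<long> load(numNodes, 0);
    for (const Partition& p : parts) {
        int lightest = static_cast<int>(
            std::min_element(load.begin(), load.end()) - load.begin());
        groups[lightest].push_back(p.id);
        load[lightest] += p.particleCount;
    }
    return groups;
}

// Pick a thread count for the particle-dynamics update on one node,
// proportional to its particle share but capped by the core count.
int threadsForNode(long particlesOnNode, long totalParticles, int coresPerNode) {
    double share = static_cast<double>(particlesOnNode) / totalParticles;
    int threads = static_cast<int>(share * coresPerNode + 0.5);
    return std::max(1, std::min(threads, coresPerNode));
}

int main() {
    // Hypothetical partition sizes from a heterogeneous particle distribution.
    std::vector<Partition> parts = {
        {0, 120000}, {1, 45000}, {2, 300000}, {3, 80000},
        {4, 150000}, {5, 60000}, {6, 210000}, {7, 95000}};
    long total = 0;
    for (const Partition& p : parts) total += p.particleCount;

    auto groups = groupPartitions(parts, /*numNodes=*/4);
    for (size_t n = 0; n < groups.size(); ++n) {
        long nodeLoad = 0;
        for (int id : groups[n]) nodeLoad += parts[id].particleCount;
        printf("node %zu: %zu partitions, %ld particles, %d threads\n",
               n, groups[n].size(), nodeLoad,
               threadsForNode(nodeLoad, total, /*coresPerNode=*/8));
    }
    return 0;
}
```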