GPU acceleration of four-way coupled PP-DNS for compressible particle-laden wall turbulence

International Journal of Multiphase Flow (IF 3.6; Zone 2, Engineering & Technology; JCR Q1, Mechanics) Pub Date: 2024-04-17 DOI: 10.1016/j.ijmultiphaseflow.2024.104840
Zi-Mo Liao, Liang-Bing Chen, Zhen-Hua Wan, Nan-Sheng Liu, Xi-Yun Lu
Citations: 0

Abstract

This paper presents an efficient implementation of four-way coupled point-particle direct numerical simulation (PP-DNS) for compressible particle-laden wall turbulence, built on the open-source finite-difference compressible Navier–Stokes solver STREAmS. The proposed design integrates a GPU-based two-phase collision detection algorithm known as the spatial subdivision method, along with specialized storage and MPI communication strategies for Lagrangian particles on multi-GPU platforms. Specifically, a 'page table'-like data structure is designed to store the particle information compactly and to enable highly parallelized packing and unpacking procedures for GPU-GPU data exchange. These advancements significantly reduce the computational cost of four-way coupled particle-laden flow simulations, enabling efficient simulations involving over O(10^7) particles (an order of magnitude more than in state-of-the-art simulations) on a single NVIDIA A100 GPU. To validate the proposed implementation, we perform simulations of compressible particle-laden wall-bounded turbulence using canonical configurations such as channel flows and zero-pressure-gradient boundary layers. The example results highlight the effects of inter-particle collisions and flow compressibility. Furthermore, we assess single-GPU performance and scalability by employing up to eight NVIDIA GPU devices. Even for four-way coupled simulations, the elapsed time per step scales approximately linearly with the number of particles (when the number of particles is large enough), and a parallel efficiency of 94.1% is achieved on 8 NVIDIA A100 GPUs.
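The abstract does not spell out the two-phase collision detection, so the following is only a minimal CPU sketch of the general spatial subdivision idea, not the paper's GPU implementation. It assumes equal-sized spherical particles; the function name `find_collisions` and the choice of a cell edge equal to the particle diameter are illustrative assumptions. The broad phase bins particle centers into a uniform grid, and the narrow phase runs the exact distance test only against particles in the 27 surrounding cells.

```python
from collections import defaultdict
from itertools import product


def find_collisions(positions, diameter):
    """Return sorted index pairs (i, j), i < j, whose centers are closer than `diameter`.

    Illustrative sketch only: assumes all particles share the same diameter.
    """
    # Assumption: cell edge equals the particle diameter, so two touching
    # particles always lie in the same cell or in directly adjacent cells.
    cell = diameter

    # Broad phase: bin every particle by the grid cell containing its center.
    grid = defaultdict(list)
    for idx, (x, y, z) in enumerate(positions):
        grid[(int(x // cell), int(y // cell), int(z // cell))].append(idx)

    # Narrow phase: exact distance test, restricted to the 27 surrounding cells.
    pairs = set()
    d2 = diameter * diameter
    for (cx, cy, cz), members in grid.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            # .get() does not create empty cells in the defaultdict.
            neighbours = grid.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                xi, yi, zi = positions[i]
                for j in neighbours:
                    if i < j:  # count each pair once, skip self-pairs
                        xj, yj, zj = positions[j]
                        if (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2 < d2:
                            pairs.add((i, j))
    return sorted(pairs)


if __name__ == "__main__":
    demo = [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0), (5.0, 5.0, 5.0), (1.05, 0.0, 0.0)]
    print(find_collisions(demo, diameter=1.0))
```

Because each particle is compared only with particles in nearby cells, the work grows roughly with the number of particles at a fixed particle density, rather than with the number of all possible pairs. This is in line with the near-linear scaling the abstract reports, although the actual GPU algorithm and its data layout may differ substantially from this sketch.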

Source journal metrics: CiteScore 7.30; self-citation rate 10.50%; articles published 244; review time 4 months
Journal description: The International Journal of Multiphase Flow publishes analytical, numerical and experimental articles of lasting interest. The scope of the journal includes all aspects of mass, momentum and energy exchange phenomena among different phases such as occur in disperse flows, gas–liquid and liquid–liquid flows, flows in porous media, boiling, granular flows and others. The journal publishes full papers, brief communications and conference announcements.