Parallel implementation and performance of super-resolution generative adversarial network turbulence models for large-eddy simulation

Computers & Fluids · IF 3.0 · JCR Q3 (Computer Science, Interdisciplinary Applications) · CAS Zone 3 (Engineering) · Pub Date: 2025-02-15 · Epub Date: 2024-12-02 · DOI: 10.1016/j.compfluid.2024.106498
Ludovico Nista, Christoph D.K. Schumann, Peicho Petkov, Valentin Pavlov, Temistocle Grenga, Jonathan F. MacArt, Antonio Attili, Stoyan Markov, Heinz Pitsch
Citations: 0

Abstract

Super-resolution (SR) generative adversarial networks (GANs) are promising for turbulence closure in large-eddy simulation (LES) due to their ability to accurately reconstruct high-resolution data from low-resolution fields. Current model training and inference strategies are not sufficiently mature for large-scale, distributed calculations due to the computational demands and often unstable training of SR-GANs, which limits the exploration of improved model structures, training strategies, and loss-function definitions. Integrating SR-GANs into LES solvers for inference-coupled simulations is also necessary to assess their a posteriori accuracy, stability, and cost. We investigate parallelization strategies for SR-GAN training and inference-coupled LES, focusing on computational performance and reconstruction accuracy. We examine distributed data-parallel training strategies for hybrid CPU–GPU node architectures and the associated influence of low-/high-resolution subbox size, global batch size, and discriminator accuracy. Accurate predictions require training subboxes that are sufficiently large relative to the Kolmogorov length scale. Care must be taken with the coupled effects of training batch size, learning rate, number of training subboxes, and the discriminator's learning capability. We introduce a data-parallel SR-GAN training and inference library for heterogeneous architectures that enables exchange between the LES solver and SR-GAN inference at runtime. We investigate the predictive accuracy and computational performance of this arrangement with particular focus on the overlap (halo) size required for accurate SR reconstruction. Similarly, a posteriori parallel scaling for efficient inference-coupled LES is constrained by the SR subdomain size, GPU utilization, and reconstruction accuracy.
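The role of the overlap (halo) size can be illustrated with a minimal sketch, in which a trivial nearest-neighbour upsampler stands in for the SR-GAN generator; the strip decomposition and all names and sizes are illustrative assumptions, not the paper's implementation. Each low-resolution subdomain is padded with `halo` cells, super-resolved independently, and the upsampled halo is cropped before stitching:

```python
import numpy as np

def upsample(field: np.ndarray, r: int) -> np.ndarray:
    """Stand-in for SR-GAN inference: nearest-neighbour upsampling by factor r."""
    return field.repeat(r, axis=0).repeat(r, axis=1)

def sr_with_halo(field: np.ndarray, n_sub: int, halo: int, r: int) -> np.ndarray:
    """Super-resolve `field` in n_sub horizontal strips, each padded with
    `halo` low-resolution cells, then crop the upsampled halo and stitch.

    Hypothetical decomposition for illustration only; a real inference-coupled
    LES would exchange these halo cells between MPI subdomains at runtime.
    """
    h, _ = field.shape
    strip = h // n_sub
    pieces = []
    for i in range(n_sub):
        lo = max(i * strip - halo, 0)          # padded strip start (clamped)
        hi = min((i + 1) * strip + halo, h)    # padded strip end (clamped)
        sub_sr = upsample(field[lo:hi], r)
        top = (i * strip - lo) * r             # high-res rows to crop at the top
        pieces.append(sub_sr[top: top + strip * r])
    return np.concatenate(pieces, axis=0)
```

For a purely local upsampler the stitched field matches the global reconstruction exactly; for a learned generator with a finite receptive field, the halo must be at least as wide as that receptive field for subdomain seams to vanish, which is why the required halo size trades off against parallel efficiency.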
Based on these findings, we establish guidelines and best practices to optimize resource utilization and parallel acceleration of SR-GAN turbulence model training and inference-coupled LES calculations while maintaining predictive accuracy.
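For the coupled effect of global batch size and learning rate noted above, a common (though not universal) heuristic in synchronous data-parallel training is linear learning-rate scaling. The sketch below is a generic illustration under that assumption; the function names are illustrative and not taken from the paper's library:

```python
# Hypothetical illustration of how global batch size and learning rate couple
# in synchronous data-parallel training with gradient averaging.

def global_batch_size(per_gpu_batch: int, num_gpus: int) -> int:
    """Effective batch size when each data-parallel worker processes its own shard."""
    return per_gpu_batch * num_gpus

def scaled_learning_rate(base_lr: float, global_batch: int, base_batch: int) -> float:
    """Linear-scaling heuristic: grow the learning rate with the global batch.

    For adversarial (GAN) training this interacts with discriminator capacity,
    so it is a starting point for tuning, not a stability guarantee.
    """
    return base_lr * (global_batch / base_batch)

if __name__ == "__main__":
    gb = global_batch_size(per_gpu_batch=8, num_gpus=16)
    print(gb, scaled_learning_rate(base_lr=1e-4, global_batch=gb, base_batch=8))
```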
Journal: Computers & Fluids (Physics; Computer Science, Interdisciplinary Applications)
CiteScore: 5.30
Self-citation rate: 7.10%
Articles per year: 242
Review time: 10.8 months
Journal description: Computers & Fluids is multidisciplinary. The term "fluid" is interpreted in the broadest sense. Hydro- and aerodynamics, high-speed and physical gas dynamics, turbulence and flow stability, multiphase flow, rheology, tribology, and fluid-structure interaction are all of interest, provided that computer technique plays a significant role in the associated studies or design methodology.