GALÆXI: Solving complex compressible flows with high-order discontinuous Galerkin methods on accelerator-based systems

arXiv - CS - Mathematical Software, Pub Date: 2024-04-19, DOI: arxiv-2404.12703

Daniel Kempf, Marius Kurz, Marcel Blind, Patrick Kopper, Philipp Offenhäuser, Anna Schwarz, Spencer Starr, Jens Keim, Andrea Beck


Citations: 0

Abstract




This work presents GALÆXI as a novel, energy-efficient flow solver for the simulation of compressible flows on unstructured meshes, leveraging the parallel computing power of modern Graphics Processing Units (GPUs). GALÆXI implements the high-order Discontinuous Galerkin Spectral Element Method (DGSEM) using shock capturing with a finite-volume subcell approach to ensure the stability of the high-order scheme near shocks. This work provides details on the general code design, the parallelization strategy, and the implementation approach for the compute kernels, with a focus on the element-local mappings between volume and surface data that arise from the unstructured mesh. GALÆXI exhibits excellent strong scaling properties up to 1024 GPUs if each GPU is assigned a minimum of one million degrees of freedom. To verify its implementation, a convergence study is performed that recovers the theoretical order of convergence of the implemented numerical schemes. Moreover, the solver is validated using both the incompressible and compressible formulations of the Taylor-Green vortex at Mach numbers of 0.1 and 1.25, respectively. A mesh convergence study shows that the results converge to the high-fidelity reference solution and that the results match the original CPU implementation. Finally, GALÆXI is applied to a large-scale wall-resolved large eddy simulation of a linear cascade of the NASA Rotor 37. Here, the supersonic region and shocks at the leading edge are captured accurately and robustly by the implemented shock-capturing approach. It is demonstrated that GALÆXI requires less than half of the energy to carry out this simulation in comparison to the reference CPU implementation. This makes GALÆXI a potent tool for accurate and efficient simulations of compressible flows in the realm of exascale computing and the associated new HPC architectures.
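The scaling threshold and the convergence check mentioned in the abstract can be illustrated with a small back-of-the-envelope sketch. It is not taken from GALÆXI itself: the function names are hypothetical, the degree-of-freedom count assumes hexahedral elements with (N+1)^3 nodes counted once per solution variable, and the observed order uses the standard two-mesh error ratio.

```python
import math


def dofs_per_element(n: int) -> int:
    """Nodes in one hexahedral DGSEM element of polynomial degree n (assumed convention)."""
    return (n + 1) ** 3


def max_gpus(n_elems: int, n: int, min_dofs_per_gpu: int = 1_000_000) -> int:
    """Largest GPU count that keeps every GPU at or above the load threshold.

    The default threshold of one million degrees of freedom per GPU is the
    figure quoted in the abstract; everything else here is illustrative.
    """
    total_dofs = n_elems * dofs_per_element(n)
    return max(1, total_dofs // min_dofs_per_gpu)


def observed_order(err_coarse: float, err_fine: float,
                   h_coarse: float, h_fine: float) -> float:
    """Experimental order of convergence from errors on two mesh sizes."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


if __name__ == "__main__":
    # Hypothetical mesh: 128^3 elements at polynomial degree 7.
    print(max_gpus(128 ** 3, 7))
    # Synthetic errors that shrink by a factor of 16 when h is halved.
    print(observed_order(0.0625, 0.00390625, 0.5, 0.25))
```

Under these assumptions, a mesh is a reasonable candidate for the strong-scaling regime described above only while the GPU count stays at or below the value returned by `max_gpus`.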

