A multi-GPU based high-performance computing framework in elastodynamics simulation using octree meshes
Authors: Shayan Mohammadian, Ankit S. Kumar, Chongmin Song
Journal: Computer Methods in Applied Mechanics and Engineering, vol. 436, Article 117723 (published 2025-01-10)
Impact factor: 6.9; JCR: Q1 (Engineering, Multidisciplinary)
DOI: 10.1016/j.cma.2024.117723
URL: https://www.sciencedirect.com/science/article/pii/S0045782524009794
Citations: 0
Abstract
This paper proposes a high-performance computing (HPC) framework for large-scale elastodynamic analysis on graphics processing units (GPUs). An octree algorithm is adopted for automatic mesh generation. The scaled boundary finite element method (SBFEM) is employed with the octree mesh, eliminating hanging nodes between octree cells of different sizes. This approach significantly reduces the computational cost and memory requirement by exploiting the limited number of master cells in a balanced octree grid, and is advantageous for GPU computation. Parallelization is achieved through mesh partitioning and the Message Passing Interface (MPI), complemented by the NVIDIA Collective Communication Library (NCCL) for optimal point-to-point communication between GPUs in HPC facilities. The framework is implemented for both explicit and implicit dynamic analysis. In the implicit analysis, the system of equations is solved with the preconditioned conjugate gradient method. Numerical examples are presented to validate the implementation and to demonstrate the capabilities of the GPU implementation. An image-based 3D model representing a portion of the Moon’s complex surface is simulated with a layered structure comprising approximately 440 million degrees of freedom. Using the explicit solver, a speed-up factor of 865 is achieved on a single computational node equipped with eight NVIDIA A100 GPUs running in parallel. A 3D virtual city comprising approximately 61 million degrees of freedom is modelled using the implicit solver.
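The preconditioned conjugate gradient iteration named in the abstract can be illustrated with a minimal CPU sketch. This is not the authors' GPU implementation: the abstract does not say which preconditioner is used, so a Jacobi (diagonal) preconditioner is assumed here, and the names `pcg`, `apply_chain`, and the small 1D test operator are hypothetical, chosen only for illustration. The operator is applied matrix-free, which is loosely analogous to evaluating element contributions without assembling a global matrix.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def pcg(apply_a: Callable[[Vector], Vector], diag: Vector, b: Vector,
        tol: float = 1e-10, max_iter: int = 1000) -> Vector:
    """Solve A x = b for symmetric positive definite A.

    apply_a computes the product A v without storing A explicitly.
    diag holds the diagonal of A (assumed Jacobi preconditioner).
    """
    x = [0.0] * len(b)
    r = list(b)                                  # residual b - A x for x = 0
    z = [ri / di for ri, di in zip(r, diag)]     # preconditioned residual
    p = list(z)                                  # initial search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0             # avoid dividing by zero
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:     # relative residual test
            break
        ap = apply_a(p)
        alpha = rz / dot(p, ap)                  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        beta = rz_new / rz                       # conjugacy coefficient
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


def apply_chain(v: Vector) -> Vector:
    """Hypothetical test operator: tridiagonal matrix with 2 on the
    diagonal and -1 on the off-diagonals (a 1D chain of unit springs)."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


# The right-hand side equals A applied to a vector of ones,
# so the solution should be close to [1, 1, 1, 1, 1].
x = pcg(apply_chain, [2.0] * 5, [1.0, 0.0, 0.0, 0.0, 1.0])
```

In a multi-GPU setting, the inner products and the operator application would be the steps that need communication between mesh partitions; the sketch above runs serially and leaves that aspect out.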
About the journal:
Computer Methods in Applied Mechanics and Engineering stands as a cornerstone in the realm of computational science and engineering. With a history spanning over five decades, the journal has been a key platform for disseminating papers on advanced mathematical modeling and numerical solutions. Interdisciplinary in nature, these contributions encompass mechanics, mathematics, computer science, and various scientific disciplines. The journal welcomes a broad range of computational methods addressing the simulation, analysis, and design of complex physical problems, making it a vital resource for researchers in the field.