A heterogeneous hybrid-precision finite volume method for compressible flow on unstructured grids
Authors: Chen Wang, Jian Xia, Long Chen
Journal: Computers & Fluids, Volume 288, Article 106505 (Impact Factor 2.5; JCR Q3, Computer Science, Interdisciplinary Applications)
Publication date: 2024-12-04
DOI: 10.1016/j.compfluid.2024.106505
URL: https://www.sciencedirect.com/science/article/pii/S0045793024003360
Citation count: 0
Abstract
Single-precision floating-point GPU calculations in modern high-performance heterogeneous computing systems are crucial for increasing the efficiency of large-scale fluid simulations on unstructured grids. However, the lack of a unified programming language for heterogeneous systems and the significant computational errors of single-precision calculations in complex problems pose major challenges. Issues such as poor data locality and data contention in unstructured grid CFD calculations also limit GPU performance. Using Kokkos for heterogeneous computation, we improved data locality through data reordering and addressed data contention using the scatter-reduce strategy, atomic operations, and the coloring approach. We introduced an innovative hybrid-precision CFD computation strategy that leverages methods based on object distance and grid geometry for precision blending. This approach harnesses the computational advantages of single-precision GPU calculations while accurately capturing boundary layer information. We assessed the accuracy and performance of these methods on a heterogeneous CPU/GPU computing system. The reverse Cuthill-McKee algorithm significantly enhances performance, and atomic operations are the optimal strategy for GPUs. With the hybrid-precision strategy proposed in this paper, the Tesla A100, RTX 4090, and RX 7900 XTX GPUs achieve overall speedups of 469, 310, and 413, respectively.
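To illustrate the data contention the abstract refers to, here is a minimal Python sketch of a face coloring strategy for a face-based flux loop. It is not the authors' Kokkos implementation: the function names, the greedy coloring rule, the sign convention, and the tiny four-cell mesh are all assumptions made for illustration. The idea is that two faces sharing a cell would write to the same residual entry, so faces are grouped into colors in which no cell appears twice, and each color could then be processed in parallel without atomic operations.

```python
def color_faces(faces):
    """Greedy coloring (hypothetical helper): faces that share a cell get different colors."""
    used_by_cell = {}  # cell index -> set of colors already used by its faces
    colors = []
    for left, right in faces:
        taken = used_by_cell.setdefault(left, set()) | used_by_cell.setdefault(right, set())
        c = 0
        while c in taken:  # smallest color not yet used by either neighbouring cell
            c += 1
        colors.append(c)
        used_by_cell[left].add(c)
        used_by_cell[right].add(c)
    return colors


def accumulate_residual(faces, flux, n_cells, colors):
    """Scatter face fluxes to cells one color at a time.

    Within a single color no two faces touch the same cell, so the inner
    loop has no write conflicts and could be run in parallel on a GPU.
    Here it is executed serially for clarity.
    """
    residual = [0.0] * n_cells
    n_colors = max(colors, default=-1) + 1
    for c in range(n_colors):
        for f, (left, right) in enumerate(faces):
            if colors[f] == c:
                residual[left] -= flux[f]   # assumed convention: flux leaves the left cell
                residual[right] += flux[f]  # and enters the right cell
    return residual


# Assumed toy mesh: four cells in a row, three interior faces.
faces = [(0, 1), (1, 2), (2, 3)]
flux = [1.0, 2.0, 3.0]
colors = color_faces(faces)
residual = accumulate_residual(faces, flux, 4, colors)
```

In this sketch the coloring trades extra kernel launches (one per color) for conflict-free writes. The abstract reports that atomic operations were the better choice on the GPUs tested, so the coloring variant is shown only to make the contention problem concrete.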
About the journal:
Computers & Fluids is multidisciplinary. The term "fluid" is interpreted in the broadest sense. Hydro- and aerodynamics, high-speed and physical gas dynamics, turbulence and flow stability, multiphase flow, rheology, tribology and fluid-structure interaction are all of interest, provided that computer technique plays a significant role in the associated studies or design methodology.