Parallel Algorithms of Well-Balanced and Weighted Average Flux for Shallow Water Model Using CUDA
Authors: Nugool Sataporn, W. Suwannik, M. Maleewong
Journal: Modelling and Simulation in Engineering (impact factor 0.8; JCR Q3, Engineering, Multidisciplinary)
Publication type: Journal Article
Published: 2021-12-10
DOI: https://doi.org/10.1155/2021/9534495
Citations: 1
Abstract
This paper presents Compute Unified Device Architecture (CUDA) implementations of a well-balanced finite volume method for solving a shallow water model. The CUDA platform allows programs to run in parallel on a GPU. Four versions of the CUDA algorithm are presented in addition to a CPU implementation, each improving on the previous one. We present the following techniques for optimizing a CUDA program: limiting register usage, changing the global memory access pattern, and loop unrolling. The accuracy of all programs is investigated in three test cases: a circular dam break on a dry bed, a circular dam break on a wet bed, and a dam break flow over three humps. The last parallel version shows a 3.84x speedup over the first CUDA implementation. We use our program to simulate a real-world problem based on an assumed partial breakage of the Srinakarin Dam located in Kanchanaburi province, Thailand. The simulation shows that the strong interaction between massive water flows and bottom elevations under wet and dry conditions is well captured by the well-balanced scheme, while the optimized parallel program produces a 57.32x speedup over the serial version.
About the journal:
Modelling and Simulation in Engineering aims to provide a forum for the discussion of formalisms, methodologies, and simulation tools intended to support the new, broader interpretation of engineering. The competitive pressures of the global economy have had a profound effect on manufacturing in Europe, Japan, and the USA, with much of the production being outsourced. In this context, the traditional interpretation of the engineering profession, linked to actual manufacturing, needs to be broadened to include the integration of outsourced components and the consideration of logistic, economic, and human factors in the design of engineering products and services.