Parallel computation to bidimensional heat equation using MPI/CUDA and FFTW package

ACS Applied Bio Materials (IF 4.7, Q2, Materials Science, Biomaterials). Pub Date: 2024-01-11. DOI: 10.3389/fcomp.2023.1305800

Tarik Chakkour


Abstract

In this study, we present a fast algorithm for the numerical solution of the heat equation, which models the diffusion of heat over time through a given region. We use a finite difference method to solve this equation numerically. We assess the performance of its parallel implementation using the Message Passing Interface (MPI) and the Compute Unified Device Architecture (CUDA), combined with time-stepping schemes such as the Forward Euler (FE) and Runge-Kutta (RK) methods. The originality of this study lies in investigating parallel implementations of the fourth-order Runge-Kutta method (RK4) for sparse matrices on Graphics Processing Unit (GPU) architectures. CUDA, provided by NVIDIA, is the leading proprietary framework for GPU computing. We report three metrics to compare the computing performance of these parallelizations: time-to-solution, speed-up, and performance. We also investigate the spectral method using the FFTW software library, which computes fast Fourier transforms (FFT) on parallel and distributed-memory architectures. The CUDA-based FFT, named CUFFT, is run on GPU platforms as a highly optimized FFTW-style implementation. Numerical tests show that this method is promising for solving the heat equation. The final results demonstrate that CUDA offers a significant performance advantage, since its computational cost is very small compared with that of the MPI implementation. This performance gain also depends on careful management of memory communication and access.
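To make the two time schemes named in the abstract concrete, the following is a minimal serial sketch in pure Python, not the paper's MPI or CUDA code. It assumes a unit square with zero Dirichlet boundaries, a five-point finite-difference Laplacian, and the initial condition sin(πx)·sin(πy). The function names (`laplacian`, `step_fe`, `step_rk4`, `solve`) and the parameters `n`, `steps`, `alpha`, and `cfl` are illustrative choices, not taken from the paper.

```python
import math


def laplacian(u, h):
    """Five-point finite-difference Laplacian; boundary entries stay zero."""
    n = len(u)
    out = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            out[i][j] = (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                         - 4.0 * u[i][j]) / (h * h)
    return out


def add_scaled(u, k, a):
    """Return the grid u + a * k, computed elementwise."""
    return [[x + a * y for x, y in zip(ru, rk)] for ru, rk in zip(u, k)]


def step_fe(u, dt, h, alpha):
    """One Forward Euler step of u_t = alpha * Laplacian(u)."""
    return add_scaled(u, laplacian(u, h), alpha * dt)


def step_rk4(u, dt, h, alpha):
    """One classical fourth-order Runge-Kutta step of the same system."""
    def rhs(v):
        return [[alpha * x for x in row] for row in laplacian(v, h)]

    k1 = rhs(u)
    k2 = rhs(add_scaled(u, k1, dt / 2))
    k3 = rhs(add_scaled(u, k2, dt / 2))
    k4 = rhs(add_scaled(u, k3, dt))
    out = u
    for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
        out = add_scaled(out, k, dt * w / 6)
    return out


def solve(step, n=17, steps=50, alpha=1.0, cfl=0.2):
    """Advance the assumed test problem; returns (grid, final time, spacing)."""
    h = 1.0 / (n - 1)
    # Forward Euler is stable in 2D only for alpha * dt / h^2 <= 0.25.
    dt = cfl * h * h / alpha
    u = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            u[i][j] = math.sin(math.pi * i * h) * math.sin(math.pi * j * h)
    for _ in range(steps):
        u = step(u, dt, h, alpha)
    return u, steps * dt, h
```

Calling `solve(step_fe)` and `solve(step_rk4)` and comparing the centre value of each grid with the analytic decay factor exp(-2π²·alpha·t) gives a quick check of both schemes. In this sketch the nested loops in `laplacian` are the part that a parallel implementation would distribute, for example across MPI ranks or GPU threads.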

