Edward Buckland, Vinh Phu Nguyen, Alban de Vaucorbeil
Computational Particle Mechanics, vol. 11, no. 5, pp. 2127-2142. Published 2024-06-05.
DOI: 10.1007/s40571-024-00768-1
Article: https://link.springer.com/article/10.1007/s40571-024-00768-1
PDF: https://link.springer.com/content/pdf/10.1007/s40571-024-00768-1.pdf
Easily porting material point methods codes to GPU
The material point method (MPM) is computationally costly and highly parallelisable. With the plateauing of Moore’s law and recent advances in parallel computing, scientists without formal programming training might face challenges in developing fast scientific codes for their research. Parallel programming is intrinsically different to serial programming and may seem daunting to certain scientists, in particular for GPUs. However, recent developments in GPU application programming interfaces (APIs) have made it easier than ever to port codes to GPU. This paper explains how we ported our modular C++ MPM code Karamelo to GPU without using low-level hardware APIs like CUDA or OpenCL. We aimed to develop a code that has abstracted parallelism and is therefore hardware agnostic. We first present an investigation of a variety of GPU APIs, comparing ease of use, hardware support and performance in an MPM context. Then, the porting process of Karamelo to the Kokkos ecosystem is detailed, discussing key design patterns and challenges. Finally, our parallel C++ code running on GPU is shown to be up to 85 times faster than on CPU. Since Kokkos also supports Python and Fortran, the principles presented therein can also be applied to codes written in these languages.
About the journal:
GENERAL OBJECTIVES: Computational Particle Mechanics (CPM) is a quarterly journal that publishes full-length original articles on the modeling and simulation of systems involving particles and particle methods. The goal is to enhance communication among researchers in the applied sciences who use "particles" in one form or another in their research.
SPECIFIC OBJECTIVES: Particle-based materials and numerical methods have become widespread in the natural and applied sciences, engineering, and biology. The term "particle methods/mechanics" now implies several different things to researchers in the 21st century, including:
(a) Particles as a physical unit in granular media, particulate flows, plasmas, swarms, etc.,
(b) Particles representing material phases in continua at the meso-, micro- and nano-scale, and
(c) Particles as a discretization unit in continua and discontinua in numerical methods such as Discrete Element Methods (DEM), Particle Finite Element Methods (PFEM), Molecular Dynamics (MD), and Smoothed Particle Hydrodynamics (SPH), to name a few.