Enhancing computational efficiency in 3-D seismic modelling with half-precision floating-point numbers based on the curvilinear grid finite-difference method

Geophysical Journal International; IF 2.8; Zone 3 (Earth Sciences); JCR Q2 (Geochemistry & Geophysics); Pub Date: 2024-07-06; DOI: 10.1093/gji/ggae235

Jialiang Wan, Wenqiang Wang, Zhenguo Zhang


Citations: 0

Abstract

Summary: Large-scale, high-resolution seismic modelling is important for simulating seismic waves, evaluating earthquake hazards, and advancing exploration seismology. However, high-resolution seismic modelling requires substantial computing and storage resources, resulting in considerable computational cost. Recent heterogeneous computing platforms, such as Nvidia graphics processing units (GPUs), natively support half-precision floating-point numbers (FP16), which can improve computational efficiency and performance. Compared with single-precision floating-point numbers (FP32), FP16 operations can provide faster calculation and lower storage requirements, and thus offer significant benefits for seismic modelling. Nevertheless, because the 16-bit format can represent far fewer values than FP32, FP16 may suffer severe numerical overflow, underflow, and floating-point errors during computation. In this study, to ensure stable wave-equation solutions and minimize floating-point errors, we employ a scaling strategy to adjust the FP16 arithmetic operations. For optimal GPU floating-point performance, we implement 2-way single instruction, multiple data (SIMD) operations within the floating-point units (FPUs) of CUDA cores. Moreover, we implement an FP16 earthquake simulation solver based on the curvilinear grid finite-difference method (CGFDM) and perform several earthquake simulations. Compared with the wavefields from the standard FP32 CGFDM, the errors introduced by FP16 are minimal, and the results show excellent consistency with the FP32 results. Performance analysis indicates that FP16 seismic modelling markedly improves computational efficiency, achieving a speedup of approximately 1.75 times and halving memory usage compared with the FP32 version.
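The overflow and underflow problem, and the general idea of scaling around it, can be illustrated with a minimal Python sketch. It is not the authors' solver: the abstract does not say how the scaling factors are chosen, so the power-of-two constants `UP` and `DOWN`, the sample values `tiny` and `huge`, and the helper names `to_fp16`, `store_scaled` and `load_scaled` are all assumptions made for illustration. FP16 rounding is emulated with the standard-library `struct` format code `e` (IEEE 754 binary16).

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value, returned as a Python float."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct raises on overflow; FP16 hardware would typically give +/-inf.
        return math.copysign(math.inf, x)


def store_scaled(x: float, scale: float) -> float:
    """Scale x into FP16's representable range, then round to FP16."""
    return to_fp16(x * scale)


def load_scaled(h: float, scale: float) -> float:
    """Undo the scaling in higher precision."""
    return h / scale


# Hypothetical field values, chosen only to trigger underflow and overflow.
tiny = 3.0e-9  # below FP16's smallest subnormal (2**-24, about 6e-8)
huge = 2.0e6   # above FP16's largest finite value (65504)

# Assumed scale factors; powers of two change only the exponent.
UP = 2.0 ** 30     # lifts small quantities into range
DOWN = 2.0 ** -10  # brings large quantities into range

naive_tiny = to_fp16(tiny)  # underflows to 0.0
naive_huge = to_fp16(huge)  # overflows to inf
scaled_tiny = load_scaled(store_scaled(tiny, UP), UP)
scaled_huge = load_scaled(store_scaled(huge, DOWN), DOWN)

# Two FP16 values occupy the same 4 bytes as one FP32 value, which is the
# storage layout behind 2-way SIMD and the halved memory footprint.
packed_pair = struct.pack("<2e", 1.0, 2.0)

print(naive_tiny, naive_huge)
print(scaled_tiny, scaled_huge)
print(len(packed_pair), struct.calcsize("<f"))
```

Power-of-two factors are used here because multiplying by them is exact in binary floating point as long as the result stays in the normal range, so the only error left is the FP16 rounding of the mantissa.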




Source journal: Geophysical Journal International (Earth Sciences: Geochemistry & Geophysics)

CiteScore: 5.40
Self-citation rate: 10.70%
Articles published: 436
Review time: 3.3 months

Journal description: Geophysical Journal International publishes top quality research papers, express letters, invited review papers and book reviews on all aspects of theoretical, computational, applied and observational geophysics.