One-Way Wave Equation Migration at Scale on GPUs Using Directive Based Programming

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Pub Date: 2017-05-01. DOI: 10.1109/IPDPS.2017.82

Kshitij Mehta, M. Hugues, Oscar R. Hernandez, D. Bernholdt, H. Calandra


Citations: 1

Abstract

One-Way Wave Equation Migration (OWEM) is a depth migration algorithm used for seismic imaging. A parallel version of this algorithm is widely implemented using MPI. Heterogeneous architectures that use GPUs have become popular in the TOP500 because of their performance/power ratio. In this paper, we discuss the methodology and code transformations used to port OWEM to GPUs using OpenACC, along with the code changes needed to scale the application to 18,400 GPUs (more than 98% of the machine) on the Titan leadership-class supercomputer at Oak Ridge National Laboratory. For the individual OpenACC kernels, we achieved an average 3x speedup on a test dataset using one GPU compared with an 8-core Intel Sandy Bridge CPU. The application was then run at large scale on Titan, achieving a peak of 1.2 petaflops while drawing an average of 5.5 megawatts. After porting the application to GPUs, we discuss how we dealt with other challenges of running at scale, such as the application becoming more I/O bound and more prone to silent errors. We believe this work will serve as valuable proof that directive-based programming models are a viable option for scaling HPC applications to heterogeneous architectures.
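The scale figures quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the total of 18,688 GPUs for Titan is an assumption taken from the machine's published configuration, not from this abstract. The flops-per-watt figure divides a peak rate by an average power draw, so it should be read as a rough indicator rather than a measured efficiency.

```python
# Rough sanity check of the scale figures quoted in the abstract.
# Function names are hypothetical; TITAN_TOTAL_GPUS is an assumption
# (Titan's published GPU count), not a number stated in the abstract.

TITAN_TOTAL_GPUS = 18_688      # assumption: one GPU per Titan compute node
GPUS_USED = 18_400             # from the abstract
PEAK_FLOPS = 1.2e15            # 1.2 petaflops, from the abstract
AVG_POWER_WATTS = 5.5e6        # 5.5 megawatts, from the abstract


def gpu_fraction(used: int, total: int) -> float:
    """Fraction of the machine's GPUs used by the run."""
    return used / total


def flops_per_watt(flops: float, watts: float) -> float:
    """Floating-point operations per second delivered per watt."""
    return flops / watts


if __name__ == "__main__":
    frac = gpu_fraction(GPUS_USED, TITAN_TOTAL_GPUS)
    eff = flops_per_watt(PEAK_FLOPS, AVG_POWER_WATTS)
    print(f"GPU fraction used: {frac:.2%}")
    print(f"Peak flops per average watt: {eff / 1e6:.0f} MFLOPS/W")
```

Under the assumed GPU total, the fraction comes out above 98%, which agrees with the abstract's "more than 98%".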


