Flexible Multi-Dimensional FFTs for Plane Wave Density Functional Theory Codes
Doru Thom Popovici, Mauro del Ben, Osni Marques, Andrew Canning
arXiv - CS - Mathematical Software, published 2024-06-08. DOI: arxiv-2406.05577
Abstract
Multi-dimensional Fourier transforms are key mathematical building blocks
that appear in a wide range of applications, from materials science, physics,
and chemistry to machine learning. Over the past years, a multitude of
software packages targeting distributed multi-dimensional Fourier transforms
have been developed. Most variants attempt to offer efficient implementations
for single transforms applied on data mapped onto rectangular grids. However,
not all scientific applications conform to this pattern; for example, plane wave
Density Functional Theory codes require multi-dimensional Fourier transforms
applied on data represented as batches of spheres. Typically, the
implementations for this use case are hand-coded and tailored for the
requirements of each application. In this work, we present the Fastest Fourier
Transform from Berkeley (FFTB), a distributed framework that offers flexible
implementations for both regular/non-regular data grids and batched/non-batched
transforms. We provide flexible implementations with a user-friendly API that
captures most of the use cases. Furthermore, we provide implementations for
both CPU and GPU platforms, showing that our approach offers improved execution
time and scalability on the HPE Cray EX supercomputer. In addition, we outline
the need for flexible implementations for different use cases of the software
package.
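To make the "batches of spheres" use case concrete, the following is a minimal single-process sketch in NumPy. It is not FFTB's API and is not distributed; the function names (`sphere_mask`, `sphere_to_grid`, `batched_inverse_fft`, `batched_forward_fft`) and the grid size and cutoff radius are assumptions chosen for illustration. It stores only the Fourier coefficients inside a sphere for each item in a batch, scatters them onto a cubic grid, and transforms the whole batch at once.

```python
import numpy as np


def sphere_mask(n, radius):
    """Boolean (n, n, n) mask of grid points whose integer frequency
    vector lies inside a sphere of the given radius."""
    # Integer frequencies in FFT order: 0, 1, ..., n/2 - 1, -n/2, ..., -1
    g = np.fft.fftfreq(n, d=1.0 / n)
    gx, gy, gz = np.meshgrid(g, g, g, indexing="ij")
    return gx**2 + gy**2 + gz**2 <= radius**2


def sphere_to_grid(coeffs, mask):
    """Scatter packed sphere coefficients of shape (batch, npoints)
    onto a zero-filled cubic grid of shape (batch, n, n, n)."""
    batch = coeffs.shape[0]
    grid = np.zeros((batch,) + mask.shape, dtype=np.complex128)
    grid[:, mask] = coeffs
    return grid


def batched_inverse_fft(coeffs, mask):
    """Fourier-space sphere -> real-space cube, for every batch item."""
    grid = sphere_to_grid(coeffs, mask)
    # Transform only the three spatial axes; axis 0 is the batch.
    return np.fft.ifftn(grid, axes=(1, 2, 3))


def batched_forward_fft(real_space, mask):
    """Real-space cube -> packed sphere coefficients, for every batch item."""
    return np.fft.fftn(real_space, axes=(1, 2, 3))[:, mask]


if __name__ == "__main__":
    n, radius, batch = 8, 2.0, 3  # illustrative sizes only
    mask = sphere_mask(n, radius)
    rng = np.random.default_rng(0)
    npoints = int(mask.sum())
    coeffs = rng.standard_normal((batch, npoints)) \
        + 1j * rng.standard_normal((batch, npoints))
    real_space = batched_inverse_fft(coeffs, mask)
    recovered = batched_forward_fft(real_space, mask)
    print(npoints, real_space.shape, np.allclose(recovered, coeffs))
```

In this sketch the sphere occupies only a small fraction of the cube, which is why hand-tuned codes typically avoid transforming the zero-filled regions; the dense `ifftn` call here does not exploit that sparsity.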