M2L Translation Operators for Kernel Independent Fast Multipole Methods on Modern Architectures
Srinath Kailasa, Timo Betcke, Sarah El Kazdadi
arXiv - CS - Computational Engineering, Finance, and Science, 2024-08-14 (arxiv-2408.07436)
Current and future trends in computer hardware, in which the disparity
between available flops and memory bandwidth continues to grow, favour
algorithm implementations which minimise data movement even at the cost of more
flops. In this study we review the requirements for high-performance
implementations of the kernel independent Fast Multipole Method (kiFMM), a
variant of the crucial FMM algorithm for the rapid evaluation of N-body
potential problems. Performant implementations of the kiFMM typically rely on
Fast Fourier Transforms for the crucial M2L (Multipole-to-Local) operation.
However, in recent years BLAS-based M2L translation operators that rely on
direct matrix compression techniques have also become popular for other FMM
variants, such as the black-box FMM. In this paper we present algorithmic
improvements for BLAS-based M2L translation operators and benchmark them
against FFT-based M2L translation operators. To allow a fair comparison, we
have implemented our own high-performance kiFMM algorithm in Rust that performs
competitively against other implementations and allows us to switch flexibly
between BLAS-based and FFT-based translation operators.