Parallel Fast Multipole Method for Potential Field Integral Equation on Sunway Supercomputer
Wen Wang
2019 IEEE International Conference on Computational Electromagnetics (ICCEM), 2019-03-20
DOI: 10.1109/COMPEM.2019.8778842
Citations: 0
Abstract
The fast multipole method (FMM) is a fast, robust, and accurate algorithm that is widely used in molecular dynamics, electrostatics, and electromagnetics simulations. In this paper, we implement and optimize a parallel FMM for the potential field integral equation on the Sunway supercomputer, which is built on heterogeneous manycore processors. Two main optimizations are proposed to improve performance: direct memory access (DMA) and SIMD vectorization. For the parallel implementation, the domain is partitioned by cutting the Morton curve into line segments, and a local essential tree is built on each process. The speedup and parallel scalability of the FMM are presented.
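To illustrate the Morton curve partitioning mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes points lie in the unit cube, interleaves the bits of their quantized coordinates into a Morton key, sorts by key, and cuts the sorted curve into contiguous segments with equal point counts. The names `morton3d` and `partition_by_morton` are hypothetical, and the equal-count cut is an assumption; the abstract does not say how the cut points are chosen.

```python
from typing import List, Sequence, Tuple


def morton3d(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of ix, iy, iz into one Morton key.

    Bit b of ix lands at position 3*b, iy at 3*b + 1, iz at 3*b + 2.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def partition_by_morton(
    points: Sequence[Tuple[float, float, float]],
    n_ranks: int,
    bits: int = 10,
) -> List[List[int]]:
    """Return, for each rank, the indices of the points it owns.

    Assumption: coordinates are in [0, 1) and every point costs the same,
    so the sorted curve is cut into segments of (nearly) equal length.
    """
    scale = 1 << bits
    # Pair each Morton key with the original point index, then sort by key.
    keyed = sorted(
        (morton3d(int(x * scale), int(y * scale), int(z * scale), bits), i)
        for i, (x, y, z) in enumerate(points)
    )
    n = len(keyed)
    parts = []
    for r in range(n_ranks):
        lo = r * n // n_ranks
        hi = (r + 1) * n // n_ranks
        parts.append([i for _, i in keyed[lo:hi]])
    return parts


if __name__ == "__main__":
    pts = [(0.75, 0.75, 0.75), (0.25, 0.25, 0.25),
           (0.75, 0.25, 0.25), (0.25, 0.75, 0.25)]
    print(partition_by_morton(pts, n_ranks=2, bits=1))
```

Because points that are close along the Morton curve tend to be close in space, each contiguous segment corresponds to a fairly compact region. This keeps most near-field interactions local to a process, and the remaining remote cells a process needs are what a local essential tree would hold.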