Speeding up aperiodic reflectarray antenna analysis by CUDA dynamic parallelism
A. Capozzoli, C. Curcio, A. Liseno, G. Toso
2014 International Conference on Numerical Electromagnetic Modeling and Optimization for RF, Microwave, and Terahertz Applications (NEMO), 2014-05-14
DOI: 10.1109/NEMO.2014.6995671 (https://doi.org/10.1109/NEMO.2014.6995671)
Citations: 1
Abstract
We discuss one of the most computationally critical steps in the Phase-Only synthesis of aperiodic reflectarrays, namely the fast evaluation of the radiation operator. We present its implementation using 2D Non-Uniform FFTs (NUFFTs) of NED (Non-Equispaced Data) type on Graphics Processing Units (GPUs) in the Compute Unified Device Architecture (CUDA) language. We also illustrate the programming strategies used to speed up the code, including the use of dynamic parallelism made available by the latest architecture of CUDA cards.
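
As a point of reference for the operation that an NED-type NUFFT accelerates, the following is a minimal sketch in plain Python of the direct (slow) 2D sum from non-equispaced sample points to a regular spectral grid. It is not the authors' GPU implementation. The function name `ned_dft_2d`, the scaling of coordinates to [-0.5, 0.5), the negative phase sign, and the absence of normalization are all assumptions made for illustration.

```python
import cmath
import math


def ned_dft_2d(xs, ys, coeffs, n_u, n_v):
    """Direct evaluation of a 2D non-equispaced-data (NED) DFT.

    xs, ys : non-equispaced sample coordinates (assumed scaled to [-0.5, 0.5))
    coeffs : complex value attached to each sample point
    Returns a nested list F[p][q] on the integer frequency grid
    k_u = p - n_u // 2, k_v = q - n_v // 2.

    Cost is O(M * n_u * n_v) for M samples; a NUFFT approximates the
    same sums at roughly FFT cost.
    """
    result = []
    for p in range(n_u):
        k_u = p - n_u // 2
        row = []
        for q in range(n_v):
            k_v = q - n_v // 2
            acc = 0j
            # Sum the contribution of every non-equispaced sample.
            for x, y, c in zip(xs, ys, coeffs):
                phase = -2.0 * math.pi * (k_u * x + k_v * y)
                acc += c * cmath.exp(1j * phase)
            row.append(acc)
        result.append(row)
    return result
```

A direct sum of this kind is mainly useful as a correctness check on small problems: the output of a fast NUFFT routine can be compared against it to estimate the approximation error.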