Authors: Yue Guo, Rongke Liu, Yi Hou, Ling Zhao
Published in: 2016 2nd IEEE International Conference on Computer and Communications (ICCC), 2016-10-01
DOI: https://doi.org/10.1109/COMPCOMM.2016.7925033
A fast carrier acquisition architecture for high-dynamic weak signal based on GPU
This paper presents a graphics processing unit (GPU)-based carrier acquisition architecture for high-dynamic weak signals in deep space communications. To achieve high performance, the carrier acquisition procedure is parallelized to exploit the GPU's parallel execution model. Using the Compute Unified Device Architecture (CUDA), separate kernels are designed for the different phases of the carrier acquisition procedure. Kernel efficiency is further improved by increasing the parallelism of operations within each kernel and by reducing the memory access latency seen by threads. In addition, multiple CUDA streams are used to hide the data transfer latency between host and device. Experimental results show that the proposed GPU-based architecture achieves a speedup of more than 250.3 times over a CPU-based platform.
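The abstract does not spell out the acquisition algorithm, so the following is only a minimal sketch of one common approach to high-dynamic carrier acquisition, not the authors' method: a grid search over hypothesised Doppler rates, where each hypothesis is removed from the signal ("dechirped") and a DFT locates the remaining frequency offset. It runs on the CPU in plain Python; the function names `make_signal` and `acquire`, the noise-free test signal, and all parameter values are assumptions made for illustration. Every (rate, frequency bin) cell is computed independently of the others, which is the kind of structure that lends itself to the GPU parallelization described above.

```python
import cmath
import math


def make_signal(n, fs, f0, rate):
    """Noise-free complex carrier with Doppler offset f0 (Hz) and Doppler rate (Hz/s).

    Hypothetical test signal; a real deep-space signal would be buried in noise.
    """
    out = []
    for k in range(n):
        t = k / fs
        phase = 2.0 * math.pi * (f0 * t + 0.5 * rate * t * t)
        out.append(cmath.exp(1j * phase))
    return out


def acquire(samples, fs, rate_candidates):
    """Return the (frequency, rate) hypothesis with the largest DFT peak power."""
    n = len(samples)
    best_power, best_freq, best_rate = -1.0, None, None
    for rate in rate_candidates:
        # Remove the hypothesised quadratic phase term (dechirp).
        dechirped = [
            s * cmath.exp(-1j * math.pi * rate * (k / fs) ** 2)
            for k, s in enumerate(samples)
        ]
        # Naive DFT, one bin at a time; each bin is independent of the others.
        for b in range(n):
            acc = sum(
                x * cmath.exp(-2j * math.pi * b * k / n)
                for k, x in enumerate(dechirped)
            )
            power = abs(acc) ** 2
            if power > best_power:
                # Bins above n/2 correspond to negative frequencies.
                freq = b * fs / n if b < n / 2 else (b - n) * fs / n
                best_power, best_freq, best_rate = power, freq, rate
    return best_freq, best_rate


if __name__ == "__main__":
    # 64 samples at 1 kHz; the carrier sits on DFT bin 8 (125 Hz) and drifts at 2 kHz/s.
    signal = make_signal(64, 1000.0, 125.0, 2000.0)
    print(acquire(signal, 1000.0, [-2000.0, 0.0, 2000.0]))
```

When the rate hypothesis matches, the dechirped signal collapses into a single DFT bin; a mismatched hypothesis leaves a residual chirp that smears the energy across several bins and lowers the peak. A practical implementation would replace the naive DFT with an FFT and evaluate the rate hypotheses concurrently.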