Authors: E. Kavvousanos, Vassilis Paliouras
Venue: 2022 11th International Conference on Modern Circuits and Systems Technologies (MOCAST)
Publication date: 2022-06-08
DOI: https://doi.org/10.1109/mocast54814.2022.9837752
A Low-Latency Syndrome-based Deep Learning Decoder Architecture and its FPGA Implementation
Machine Learning has recently been considered as an alternative design paradigm for various communications sub-systems. However, few works have assessed the performance of these methods beyond the algorithmic level. In this paper, we implement in hardware the Syndrome-based Deep Learning Decoder for a BCH(63,45) code and evaluate its throughput rate and latency. The implemented Neural Network is compressed by applying pruning, clustering, and quantization to an 8-bit fixed-point representation, achieving 90% weight sparsity in each layer with no significant loss in BER performance. An FPGA architecture is designed for the decoder that exploits the compressed structure of the Neural Network to accelerate the underlying computations with moderate hardware requirements. Experimental results show that the decoder achieves a latency of less than a tenth of a millisecond and a throughput rate of up to 5 Mbps, outperforming previous implementations by 30×.
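To make the compression steps named in the abstract more concrete, the following is a minimal sketch of magnitude pruning to 90% sparsity followed by 8-bit fixed-point quantization of one layer's weights. It is not the authors' implementation: the function names `prune_by_magnitude` and `quantize_fixed_point`, the random Gaussian weights, the layer size of 1000, and the power-of-two scaling rule are all assumptions made for illustration, and the clustering step is omitted.

```python
import math
import random


def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_fixed_point(weights, total_bits=8):
    """Quantize to signed fixed-point integers with a power-of-two scale.

    Returns the integer codes and the number of fractional bits, so that
    each weight is approximated by code / 2**frac_bits.
    """
    q_max = 2 ** (total_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs > 0:
        # Largest number of fractional bits that keeps max_abs representable.
        frac_bits = math.floor(math.log2(q_max / max_abs))
    else:
        frac_bits = total_bits - 1
    scale = 2.0 ** frac_bits
    codes = [max(-q_max - 1, min(q_max, round(w * scale))) for w in weights]
    return codes, frac_bits


# Hypothetical layer: 1000 weights drawn from a standard normal distribution.
random.seed(0)
layer = [random.gauss(0.0, 1.0) for _ in range(1000)]

pruned = prune_by_magnitude(layer, sparsity=0.9)
codes, frac_bits = quantize_fixed_point(pruned, total_bits=8)
sparsity = sum(1 for c in codes if c == 0) / len(codes)
```

In a hardware design of the kind the abstract describes, only the nonzero codes and their positions would need to be stored and multiplied, which is where the 90% sparsity can save computation. How the actual architecture encodes and schedules them is not stated here.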