Hoai Luan Pham;Vu Trung Duong Le;Van Duy Tran;Tuan Hai Vu;Yasuhiko Nakashima
Title: LiCryptor: High-Speed and Compact Multi-Grained Reconfigurable Accelerator for Lightweight Cryptography
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 71, no. 10, pp. 4624-4637 (impact factor 5.2; JCR Q1, Engineering, Electrical & Electronic)
Publication date: 2024-08-01
DOI: 10.1109/TCSI.2024.3434686
URL: https://ieeexplore.ieee.org/document/10620019/
Citation count: 0
Abstract
Emerging internet-of-things (IoT) systems require hardware that supports multiple 8/32/64-bit lightweight cryptographic (LWC) algorithms with high speed and energy efficiency to meet diverse security requirements. Accordingly, a coarse-grained reconfigurable array (CGRA) is considered the most effective architecture for implementing LWC algorithms with high speed, low power, and high flexibility. However, existing CGRA designs for cryptography focus only on improving outdated 8/32-bit algorithms, suffer from large area requirements, and have long compilation times. To address these issues, this paper proposes a new CGRA-based accelerator named LiCryptor that supports various 8/32/64-bit LWC algorithms with high speed and small area. Three ideas enable LiCryptor to achieve these goals: a compact multi-grained processing element array (M-PEA), a shared 8/32/64-bit arithmetic logic unit (ALU), and an assembly-like inline directive (AID) mapping method. LiCryptor has been implemented and verified on the Xilinx ZCU102 FPGA. Real-time performance evaluation across various LWC algorithms on the FPGA shows that LiCryptor is 1.33 to 4 times faster in execution time and 3.4 to 153 times better in power-delay product (PDP) than today’s most powerful CPUs. Notably, evaluation of AID mapping on the ARM Cortex-A53 CPU of the ZCU102 FPGA shows that its compilation time is less than 1.5 ms for most LWC algorithms, at least 2,333 times faster than CFG mapping in current CGRAs. Moreover, experimental results on 45 nm ASIC technology show that LiCryptor significantly outperforms existing CGRAs and other reconfigurable designs in throughput and area efficiency.
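The power-delay product cited in the abstract is conventionally the average power multiplied by the execution time, which gives the energy per run. The sketch below shows how a speedup and a PDP improvement factor could be derived from such measurements. The function names and all numeric values are hypothetical placeholders chosen for illustration; they are not measurements from the paper.

```python
def power_delay_product(power_watts: float, delay_seconds: float) -> float:
    """Return the PDP in joules: average power times execution time."""
    return power_watts * delay_seconds


def improvement_factor(baseline: float, candidate: float) -> float:
    """Return how many times smaller the candidate value is than the baseline."""
    return baseline / candidate


# Hypothetical measurements (assumptions, NOT taken from the paper).
cpu_power_w, cpu_delay_s = 8.0, 0.5        # baseline CPU
accel_power_w, accel_delay_s = 2.0, 0.25   # accelerator

cpu_pdp = power_delay_product(cpu_power_w, cpu_delay_s)
accel_pdp = power_delay_product(accel_power_w, accel_delay_s)

speedup = improvement_factor(cpu_delay_s, accel_delay_s)
pdp_gain = improvement_factor(cpu_pdp, accel_pdp)

print(f"speedup: {speedup}x, PDP improvement: {pdp_gain}x")
```

Because PDP combines power and delay, a design that is only modestly faster can still show a much larger PDP gain when it also draws less power, which is consistent with the abstract reporting a wider range for PDP than for execution time.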
About the journal:
TCAS I publishes regular papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
- Circuits: Analog, Digital and Mixed Signal Circuits and Systems
- Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
- Software for Analog-and-Logic Circuits and Systems
- Control aspects of Circuits and Systems