{"title":"CRYSTALS-Kyber 的轻量级硬件实现","authors":"Shiyang He , Hui Li , Fenghua Li , Ruhui Ma","doi":"10.1016/j.jiixd.2024.02.004","DOIUrl":null,"url":null,"abstract":"<div><p>The security of cryptographic algorithms based on integer factorization and discrete logarithm will be threatened by quantum computers in future. Since December 2016, the National Institute of Standards and Technology (NIST) has begun to solicit post-quantum cryptographic (PQC) algorithms worldwide. CRYSTALS-Kyber was selected as the standard of PQC algorithm after 3 rounds of evaluation. Meanwhile considering the large resource consumption of current implementation, this paper presents a lightweight architecture for ASICs and its implementation on FPGAs for prototyping. In this implementation, a novel compact modular multiplication unit (MMU) and compression/decompression module is proposed to save hardware resources. We put forward a specially optimized schoolbook polynomial multiplication (SPM) instead of number theoretic transform (NTT) core for polynomial multiplication, which can reduce about 74% SLICE cost. We also use signed number representation to save memory resources. In addition, we optimize the hardware implementation of the Hash module, which cuts off about 48% of FF consumption by register reuse technology. Our design can be implemented on Kintex-7 (XC7K325T-2FFG900I) FPGA for prototyping, which occupations of 4777/4993 LUTs, 2661/2765 FFs, 1395/1452 SLICEs, 2.5/2.5 BRAMs, and 0/0 DSP respective of client/server side. The maximum clock frequency can reach at 244 MHz. As far as we know, our design consumes the least resources compared with other existing designs, which is very friendly to resource-constrained devices.</p></div>","PeriodicalId":100790,"journal":{"name":"Journal of Information and Intelligence","volume":"2 2","pages":"Pages 167-176"},"PeriodicalIF":0.0000,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S294971592400009X/pdfft?md5=554b4ca1fa191ff4a92f726744e62d79&pid=1-s2.0-S294971592400009X-main.pdf","citationCount":"0","resultStr":"{\"title\":\"A lightweight hardware implementation of CRYSTALS-Kyber\",\"authors\":\"Shiyang He , Hui Li , Fenghua Li , Ruhui Ma\",\"doi\":\"10.1016/j.jiixd.2024.02.004\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The security of cryptographic algorithms based on integer factorization and discrete logarithm will be threatened by quantum computers in future. Since December 2016, the National Institute of Standards and Technology (NIST) has begun to solicit post-quantum cryptographic (PQC) algorithms worldwide. CRYSTALS-Kyber was selected as the standard of PQC algorithm after 3 rounds of evaluation. Meanwhile considering the large resource consumption of current implementation, this paper presents a lightweight architecture for ASICs and its implementation on FPGAs for prototyping. In this implementation, a novel compact modular multiplication unit (MMU) and compression/decompression module is proposed to save hardware resources. We put forward a specially optimized schoolbook polynomial multiplication (SPM) instead of number theoretic transform (NTT) core for polynomial multiplication, which can reduce about 74% SLICE cost. We also use signed number representation to save memory resources. In addition, we optimize the hardware implementation of the Hash module, which cuts off about 48% of FF consumption by register reuse technology. Our design can be implemented on Kintex-7 (XC7K325T-2FFG900I) FPGA for prototyping, which occupations of 4777/4993 LUTs, 2661/2765 FFs, 1395/1452 SLICEs, 2.5/2.5 BRAMs, and 0/0 DSP respective of client/server side. The maximum clock frequency can reach at 244 MHz. As far as we know, our design consumes the least resources compared with other existing designs, which is very friendly to resource-constrained devices.</p></div>\",\"PeriodicalId\":100790,\"journal\":{\"name\":\"Journal of Information and Intelligence\",\"volume\":\"2 2\",\"pages\":\"Pages 167-176\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S294971592400009X/pdfft?md5=554b4ca1fa191ff4a92f726744e62d79&pid=1-s2.0-S294971592400009X-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Information and Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S294971592400009X\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Information and Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S294971592400009X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A lightweight hardware implementation of CRYSTALS-Kyber
The security of cryptographic algorithms based on integer factorization and discrete logarithm will be threatened by quantum computers in future. Since December 2016, the National Institute of Standards and Technology (NIST) has begun to solicit post-quantum cryptographic (PQC) algorithms worldwide. CRYSTALS-Kyber was selected as the standard of PQC algorithm after 3 rounds of evaluation. Meanwhile considering the large resource consumption of current implementation, this paper presents a lightweight architecture for ASICs and its implementation on FPGAs for prototyping. In this implementation, a novel compact modular multiplication unit (MMU) and compression/decompression module is proposed to save hardware resources. We put forward a specially optimized schoolbook polynomial multiplication (SPM) instead of number theoretic transform (NTT) core for polynomial multiplication, which can reduce about 74% SLICE cost. We also use signed number representation to save memory resources. In addition, we optimize the hardware implementation of the Hash module, which cuts off about 48% of FF consumption by register reuse technology. Our design can be implemented on Kintex-7 (XC7K325T-2FFG900I) FPGA for prototyping, which occupations of 4777/4993 LUTs, 2661/2765 FFs, 1395/1452 SLICEs, 2.5/2.5 BRAMs, and 0/0 DSP respective of client/server side. The maximum clock frequency can reach at 244 MHz. As far as we know, our design consumes the least resources compared with other existing designs, which is very friendly to resource-constrained devices.