MBSNTT: A Highly Parallel Digital In-Memory Bit-Serial Number Theoretic Transform Accelerator
Authors: Akhil Pakala; Zhiyu Chen; Kaiyuan Yang
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 2, pp. 537-545
Publication date: 2024-09-26
DOI: 10.1109/TVLSI.2024.3462955
URL: https://ieeexplore.ieee.org/document/10695040/
Impact factor: 2.8 (JCR Q2, Computer Science, Hardware & Architecture)
Citation count: 0
Abstract
Conventional cryptographic systems protect data during communication but give third-party cloud operators complete access to the decrypted user data they compute on. Homomorphic encryption (HE) promises to rectify this by allowing computations to be performed directly on encrypted data, without decrypting it. However, HE incurs latency several orders of magnitude higher than conventional encryption schemes. The number theoretic transform (NTT), the transform underlying fast polynomial multiplication, is the bottleneck function in HE. In traditional architectures, memory accesses and limited support for parallel operations constrain NTT throughput and energy efficiency. Processing in memory (PIM) is a promising approach that can maximize parallelism while remaining highly energy efficient. To enable HE on resource-constrained edge devices, this article presents MBSNTT, a digital in-memory Multi-Bit-Serial NTT accelerator that achieves high parallelism and energy efficiency for NTT with minimized area. MBSNTT features a novel multi-bit-serial modular multiplication algorithm and a PIM implementation that computes all modular multiplications in an NTT in parallel. It further adopts a constant-geometry NTT data flow for efficient transitions between NTT stages and between cores. Our evaluation shows that MBSNTT achieves $1.62\times$ ($19.08\times$) higher throughput and $64.9\times$ ($2.06\times$) lower energy than the state-of-the-art PIM NTT accelerator Crypto-PIM (MeNTT), at a polynomial order of 8K and a bit width of 128.
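The two ideas named in the abstract, bit-serial modular multiplication and a constant-geometry NTT data flow, can be illustrated with a small software sketch. This is not the paper's multi-bit-serial algorithm or its in-memory mapping, neither of which is described here. It uses the textbook one-bit-per-step interleaved shift-and-add multiplier and a Pease-style constant-geometry schedule. All function names, the toy modulus 17, and the root of unity 2 are assumptions chosen for illustration, and the negacyclic pre-scaling typically used in HE is omitted.

```python
def bit_serial_modmul(a, b, q, width):
    """Return (a * b) mod q, consuming b one bit per step, MSB first.

    Assumes 0 <= a < q and 0 <= b < 2**width (illustrative contract).
    """
    acc = 0
    for i in reversed(range(width)):
        acc <<= 1                  # doubling keeps acc < 2q
        if acc >= q:
            acc -= q               # one conditional subtraction restores acc < q
        if (b >> i) & 1:
            acc += a               # add the multiplicand when the bit is set
            if acc >= q:
                acc -= q
    return acc


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def constant_geometry_ntt(a, omega, q):
    """Cyclic NTT of a (length n = 2**k) with omega a primitive n-th root mod q.

    Every stage uses the same wiring: read the pair (2j, 2j+1) and
    write to (j, j + n/2). Only the twiddle factor changes per stage.
    """
    n = len(a)
    bits = n.bit_length() - 1
    assert n == 1 << bits, "length must be a power of two"
    width = q.bit_length()
    # Bit-reversed input order yields natural-order output.
    x = [a[bit_reverse(i, bits)] % q for i in range(n)]
    for stage in range(bits):
        shift = bits - 1 - stage
        y = [0] * n
        for j in range(n // 2):
            w = pow(omega, (j >> shift) << shift, q)   # stage-dependent twiddle
            u = x[2 * j]
            v = bit_serial_modmul(w, x[2 * j + 1], q, width)
            y[j] = (u + v) % q
            y[j + n // 2] = (u - v) % q
        x = y
    return x


def naive_ntt(a, omega, q):
    """O(n^2) reference: X[k] = sum_i a[i] * omega**(i*k) mod q."""
    n = len(a)
    return [sum(a[i] * pow(omega, i * k, q) for i in range(n)) % q
            for k in range(n)]


# Toy parameters: 2 is a primitive 8th root of unity modulo 17.
Q, OMEGA = 17, 2
coeffs = [3, 1, 4, 1, 5, 9, 2, 6]
assert constant_geometry_ntt(coeffs, OMEGA, Q) == naive_ntt(coeffs, OMEGA, Q)
```

Because the read and write pattern is identical in every stage, the n/2 butterflies of a stage are independent and could in principle run side by side, which is the property a parallel in-memory design would exploit. The sketch runs them sequentially.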
About the journal:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. The coverage of these Transactions therefore focuses on VLSI/ULSI microelectronic systems integration.