SMBHA: A System-Level Multicore BGV Hardware Accelerator Based on FPGA
Authors: Jia-Li Duan; Chi Zhang; Li-Hui Wang; Lei Shen
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 2, pp. 546-557
Published: 2024-11-14
DOI: 10.1109/TVLSI.2024.3480997
URL: https://ieeexplore.ieee.org/document/10753094/
Fully homomorphic encryption (FHE) enables computation on encrypted data and is a crucial foundation for privacy-preserving computing. However, its high computational overhead restricts widespread application: even after algorithmic and software optimization, processing speed remains low. This article proposes the first practical system-level multicore Brakerski-Gentry-Vaikuntanathan (BGV) hardware acceleration scheme based on a field-programmable gate array (FPGA). Based on an analysis of the bottlenecks in system-level acceleration, a hierarchical storage structure is introduced to reduce data movement. A novel 4-2 mixed-radix number theoretic transform (NTT) algorithm is proposed that switches flexibly between radix-4 and radix-2 and can reuse twiddle factors. In addition, a reconfigurable processing element (PE) is proposed that supports all homomorphic operations of BGV. The design is evaluated on a Xilinx Virtex-7 series FPGA and achieves NTT/inverse NTT (INTT) throughput up to $14\times$ higher than previous designs. Compared with the Simple Encrypted Arithmetic Library (SEAL), the full-system performance of homomorphic encryption (ENC), decryption (DEC), and homomorphic multiplication improves by $13.9\times$, $7.07\times$, and $16.6\times$, respectively.
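To make the idea of mixing radix-4 and radix-2 stages concrete, the following is a minimal software sketch of a textbook decimation-in-time NTT that uses radix-4 butterflies whenever the length is divisible by 4 and falls back to a radix-2 butterfly otherwise. It is not the article's 4-2 algorithm or its hardware datapath. The modulus 257, the generator 3, and all function names are assumptions chosen for illustration, and the transform is cyclic rather than the negacyclic variant typically used in BGV.

```python
Q = 257  # small Fermat prime, assumed here only to keep the example tiny
G = 3    # a primitive root modulo 257


def root_of_unity(n, q=Q, g=G):
    """Return a primitive n-th root of unity mod q (n must divide q - 1)."""
    assert (q - 1) % n == 0
    return pow(g, (q - 1) // n, q)


def ntt_naive(x, w, q=Q):
    """O(n^2) reference transform: X[k] = sum_j x[j] * w^(j*k) mod q."""
    n = len(x)
    return [sum(x[j] * pow(w, j * k, q) for j in range(n)) % q
            for k in range(n)]


def ntt_mixed(x, w, q=Q):
    """Recursive NTT: radix-4 stages where possible, radix-2 otherwise."""
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    if n == 1:
        return list(x)
    out = [0] * n
    if n % 4 == 0:
        # Radix-4 stage: four sub-transforms of length n/4.
        m = n // 4
        a, b, c, d = (ntt_mixed(x[r::4], pow(w, 4, q), q) for r in range(4))
        im = pow(w, m, q)  # primitive 4th root of unity: im * im == -1 mod q
        tw1 = 1            # w^k; w^(2k) and w^(3k) are derived from it
        for k in range(m):
            tw2 = tw1 * tw1 % q
            tw3 = tw2 * tw1 % q
            t0, t1, t2, t3 = a[k], tw1 * b[k] % q, tw2 * c[k] % q, tw3 * d[k] % q
            s0, s1 = t0 + t2, t0 - t2
            s2, s3 = t1 + t3, im * (t1 - t3) % q
            out[k] = (s0 + s2) % q
            out[k + m] = (s1 + s3) % q
            out[k + 2 * m] = (s0 - s2) % q
            out[k + 3 * m] = (s1 - s3) % q
            tw1 = tw1 * w % q
        return out
    # Radix-2 stage: for power-of-two lengths this is reached only when n == 2.
    h = n // 2
    even = ntt_mixed(x[0::2], w * w % q, q)
    odd = ntt_mixed(x[1::2], w * w % q, q)
    tw = 1
    for k in range(h):
        t = tw * odd[k] % q
        out[k] = (even[k] + t) % q
        out[k + h] = (even[k] - t) % q
        tw = tw * w % q
    return out


def intt_mixed(X, w, q=Q):
    """Inverse transform: forward NTT with w^-1, then scale by n^-1."""
    n = len(X)
    n_inv = pow(n, -1, q)
    return [v * n_inv % q for v in ntt_mixed(X, pow(w, -1, q), q)]
```

For a length-8 input, one radix-4 stage is followed by radix-2 leaves, whereas a length-16 input is handled entirely by radix-4 stages. In this sketch each radix-4 butterfly derives its second and third twiddle factors from a single stored power of `w`, which is only a loose software analogy to twiddle reuse. `pow(x, -1, q)` requires Python 3.8 or later.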
About the journal:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.