Accelerating Montgomery Modulo Multiplication for Redundant Radix-64k Number System on the FPGA Using Dual-Port Block RAMs

K. Shigemoto, K. Kawakami, K. Nakano
{"title":"Accelerating Montgomery Modulo Multiplication for Redundant Radix-64k Number System on the FPGA Using Dual-Port Block RAMs","authors":"K. Shigemoto, K. Kawakami, K. Nakano","doi":"10.1109/EUC.2008.30","DOIUrl":null,"url":null,"abstract":"The main contribution of this paper is to present hardware algorithms for redundant radix-2r number system in the FPGA to accelerate Montgomery modulo multiplication with many bits, which have applications in security systems such as RSA encryption and decryption. Quite surprisingly, our hardware algorithm for Montgomery modulo multiplication of two dr-bit numbers can be completed in only d+1 clock cycles. Since most FPGAs have 18-bit multipliers and 18 k-bit block RAMs, it makes sense to let r=16. Our hardware algorithm for Montgomery modulo multiplication for 256-bit numbers runs only 17 clock cycles using redundant radix-64 k (i.e.radix-216) number system. The experimental results for Xilinx Virtex-II Pro Family FPGA XC2VP100-6 show that the clock frequency of our circuit is independent of d. Further, the hardware algorithm for 1024-bit Montgomery modulo multiplication using the redundant number system is 3 times faster than that using the conventional number system. Also, for 256-bit Montgomery modulo multiplication, our hardware algorithm runs in 0.322 mus, while a previously known implementation runs in 1.22 mus although our implementation uses less than a half slices.","PeriodicalId":430277,"journal":{"name":"2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing","volume":"63 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/EUC.2008.30","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16

Abstract

The main contribution of this paper is to present hardware algorithms for redundant radix-2r number system in the FPGA to accelerate Montgomery modulo multiplication with many bits, which have applications in security systems such as RSA encryption and decryption. Quite surprisingly, our hardware algorithm for Montgomery modulo multiplication of two dr-bit numbers can be completed in only d+1 clock cycles. Since most FPGAs have 18-bit multipliers and 18 k-bit block RAMs, it makes sense to let r=16. Our hardware algorithm for Montgomery modulo multiplication for 256-bit numbers runs only 17 clock cycles using redundant radix-64 k (i.e.radix-216) number system. The experimental results for Xilinx Virtex-II Pro Family FPGA XC2VP100-6 show that the clock frequency of our circuit is independent of d. Further, the hardware algorithm for 1024-bit Montgomery modulo multiplication using the redundant number system is 3 times faster than that using the conventional number system. Also, for 256-bit Montgomery modulo multiplication, our hardware algorithm runs in 0.322 mus, while a previously known implementation runs in 1.22 mus although our implementation uses less than a half slices.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于双端口块ram的冗余基数-64k数字系统的FPGA加速Montgomery模乘法
本文的主要贡献是在FPGA中提出冗余基数-2r数系统的硬件算法,以加速多比特的Montgomery模乘法,该算法在RSA加解密等安全系统中具有应用价值。令人惊讶的是,我们的两个dr位的Montgomery模乘法的硬件算法可以在d+1时钟周期内完成。由于大多数fpga具有18位乘法器和18 k位块ram,因此让r=16是有意义的。我们的256位数字的Montgomery模乘法硬件算法使用冗余基数- 64k(即基数-216)数字系统仅运行17个时钟周期。在Xilinx Virtex-II Pro系列FPGA XC2VP100-6上的实验结果表明,该电路的时钟频率与d无关。此外,使用冗余数字系统进行1024位Montgomery模乘法的硬件算法比使用传统数字系统快3倍。此外,对于256位Montgomery模乘法,我们的硬件算法运行时间为0.322 mus,而以前已知的实现运行时间为1.22 mus,尽管我们的实现使用的切片不到一半。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Automatic Integration of Non-Bus Hardware IP into SoC-Platforms for Use by Software A Self-Healing and Mutual-Healing Key Distribution Scheme Using Bilinear Pairings for Wireless Networks Simple Certificateless Signature with Smart Cards Implementation and Evaluation of SIP-Based Secure VoIP Communication System Adaptive Drowsy Cache Control for Java Applications
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1