Title: High radix Montgomery modular multiplication on FPGA
Authors: A. Mohamed, Anane Nadjia
Published in: 2013 8th IEEE Design and Test Symposium, 2013-12-01
DOI: 10.1109/IDT.2013.6727148 (https://doi.org/10.1109/IDT.2013.6727148)
Citation count: 0
Abstract
Improving the speed and area of Montgomery modular multiplication (MMM) is crucial for public-key cryptography applications. This paper presents an efficient hardware algorithm for a high-radix MMM method that exploits the features available in the Xilinx Virtex-5 FPGA. Our main contribution is the development of hardware algorithms for the radix-2^16 number system on the FPGA to speed up the MMM. The design multiplies two 1024-bit numbers in 64 iterations. The carry-save (CS) representation is used to avoid carry propagation, so that the iteration cycle time is independent of the datapath length. Special effort was made to design, at the LUT level, the 6:2 compressor, which is the key feature of our design. The resulting architecture can run with a clock period equivalent to the total delay of one embedded 18×18-bit multiplier and two LUT6s.
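The abstract does not list the algorithm itself. As a rough software illustration of why radix 2^16 gives 64 iterations for 1024-bit operands, the sketch below assumes the standard word-serial Montgomery form, which may differ from the authors' hardware formulation. All names (`montgomery_multiply`, `RADIX_BITS`, `m_prime`, and so on) are hypothetical. The accumulator is held as an ordinary Python integer, so the carry-save representation and the 6:2 compressor are not modeled.

```python
RADIX_BITS = 16                          # digit width k, so the radix is 2**16
OPERAND_BITS = 1024                      # operand width n
NUM_WORDS = OPERAND_BITS // RADIX_BITS   # 1024 / 16 = 64 iterations


def montgomery_multiply(a, b, modulus):
    """Return a * b * R^-1 mod modulus, where R = 2**OPERAND_BITS.

    Assumes an odd modulus below 2**OPERAND_BITS and 0 <= a, b < modulus.
    This is a behavioral model only, not the paper's datapath.
    """
    if modulus % 2 == 0:
        raise ValueError("modulus must be odd")
    base = 1 << RADIX_BITS
    mask = base - 1
    # Precomputed constant: -modulus^-1 mod 2**16 (needs Python 3.8+).
    m_prime = (-pow(modulus, -1, base)) % base

    acc = 0
    for i in range(NUM_WORDS):
        a_i = (a >> (RADIX_BITS * i)) & mask      # next 16-bit digit of a
        acc += a_i * b                            # partial product
        q = ((acc & mask) * m_prime) & mask       # quotient digit
        acc = (acc + q * modulus) >> RADIX_BITS   # low 16 bits are zero here
    # The loop keeps acc below 2 * modulus, so one subtraction suffices.
    if acc >= modulus:
        acc -= modulus
    return acc
```

Each loop pass consumes one 16-bit digit of `a`, which is where the 64-iteration count comes from. In hardware, the two additions per pass are the operations a carry-save accumulator would absorb without propagating carries.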