修改蒙哥马利模块化乘法使用4:2压缩器和CSA加法器

Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06) Pub Date : 2006-01-17 DOI:10.1109/DELTA.2006.70

H. Thapliyal, Anvesh Ramasahayam, V. Kotha, Kunul Gottimukkula, M. Srinivas

{"title":"修改蒙哥马利模块化乘法使用4:2压缩器和CSA加法器","authors":"H. Thapliyal, Anvesh Ramasahayam, V. Kotha, Kunul Gottimukkula, M. Srinivas","doi":"10.1109/DELTA.2006.70","DOIUrl":null,"url":null,"abstract":"The efficiency of the public key encryption systems like RSA and ECC can be improved with the adoption of a faster multiplication scheme. In this paper, Modified Montgomery multiplications and circuit architectures are presented. The first modified Montgomery multiplier uses 4:2 compressor and carry save adders (CSA) to perform large word length additions. The total delay for a single modular multiplication using the proposed approach is 7XOR+1 AND gate compared to 8XOR+1AND gate of the recently proposed fastest algorithm. The second modified Montgomery multiplier uses a novel proposed hardware unit that outputs carry save representation of the 4-input operands in 3XOR delays. The total delay for a single modular multiplication using the novel hardware unit is 5XOR+1 AND gate compared to 6XOR+1AND gate of the recently proposed algorithm. The optimal transistor implementations of the proposed approaches have also been presented. The proposed transistor implementations are highly optimized in terms of area, speed and low power. The proposed Montgomery multiplication circuit will be of eminent importance when implemented for higher word length such as 1024 and 2048 as there will be saving in the propagation delays by 1024 and 2048 XOR gates respectively compared to the recently proposed fastest algorithm.","PeriodicalId":439448,"journal":{"name":"Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"Modified Montgomery modular multiplication using 4:2 compressor and CSA adder\",\"authors\":\"H. Thapliyal, Anvesh Ramasahayam, V. Kotha, Kunul Gottimukkula, M. Srinivas\",\"doi\":\"10.1109/DELTA.2006.70\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The efficiency of the public key encryption systems like RSA and ECC can be improved with the adoption of a faster multiplication scheme. In this paper, Modified Montgomery multiplications and circuit architectures are presented. The first modified Montgomery multiplier uses 4:2 compressor and carry save adders (CSA) to perform large word length additions. The total delay for a single modular multiplication using the proposed approach is 7XOR+1 AND gate compared to 8XOR+1AND gate of the recently proposed fastest algorithm. The second modified Montgomery multiplier uses a novel proposed hardware unit that outputs carry save representation of the 4-input operands in 3XOR delays. The total delay for a single modular multiplication using the novel hardware unit is 5XOR+1 AND gate compared to 6XOR+1AND gate of the recently proposed algorithm. The optimal transistor implementations of the proposed approaches have also been presented. The proposed transistor implementations are highly optimized in terms of area, speed and low power. The proposed Montgomery multiplication circuit will be of eminent importance when implemented for higher word length such as 1024 and 2048 as there will be saving in the propagation delays by 1024 and 2048 XOR gates respectively compared to the recently proposed fastest algorithm.\",\"PeriodicalId\":439448,\"journal\":{\"name\":\"Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06)\",\"volume\":\"127 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-01-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DELTA.2006.70\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DELTA.2006.70","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 17

摘要

采用更快的乘法方案可以提高RSA和ECC等公钥加密系统的效率。本文介绍了改进蒙哥马利乘法和电路结构。第一个修改的Montgomery乘法器使用4:2压缩器和进位保存加法器(CSA)来执行大单词长度的加法。与最近提出的最快算法的8XOR+1AND门相比，使用该方法进行单个模乘法的总延迟为7XOR+1 AND门。第二个改进的蒙哥马利乘法器使用了一种新颖的硬件单元，该单元以3XOR延迟输出4个输入操作数的进位保存表示。与最近提出的算法的6XOR+1AND门相比，使用新硬件单元的单模块乘法的总延迟为5XOR+1 AND门。本文还介绍了所提方法的最佳晶体管实现。提出的晶体管实现在面积、速度和低功耗方面进行了高度优化。当实现更高字长(如1024和2048)时，建议的Montgomery乘法电路将非常重要，因为与最近提出的最快算法相比，将分别节省1024和2048个异或门的传播延迟。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Modified Montgomery modular multiplication using 4:2 compressor and CSA adder

The efficiency of the public key encryption systems like RSA and ECC can be improved with the adoption of a faster multiplication scheme. In this paper, Modified Montgomery multiplications and circuit architectures are presented. The first modified Montgomery multiplier uses 4:2 compressor and carry save adders (CSA) to perform large word length additions. The total delay for a single modular multiplication using the proposed approach is 7XOR+1 AND gate compared to 8XOR+1AND gate of the recently proposed fastest algorithm. The second modified Montgomery multiplier uses a novel proposed hardware unit that outputs carry save representation of the 4-input operands in 3XOR delays. The total delay for a single modular multiplication using the novel hardware unit is 5XOR+1 AND gate compared to 6XOR+1AND gate of the recently proposed algorithm. The optimal transistor implementations of the proposed approaches have also been presented. The proposed transistor implementations are highly optimized in terms of area, speed and low power. The proposed Montgomery multiplication circuit will be of eminent importance when implemented for higher word length such as 1024 and 2048 as there will be saving in the propagation delays by 1024 and 2048 XOR gates respectively compared to the recently proposed fastest algorithm.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06)

自引率

0.00%

发文量