{"title":"Unified Digit Serial Systolic Montgomery Multiplication Architecture for Special Classes of Polynomials over GF(2m)","authors":"S. Talapatra, H. Rahaman, Samir K. Saha","doi":"10.1109/DSD.2010.59","DOIUrl":null,"url":null,"abstract":"This paper presents an unified digit-serial systolic multiplication architecture for all-one polynomials (AOP) and trinomial over GF (2m) for efficient implementation of Montgomery Multiplication (MM) algorithm suitable for cryptosystem. This is the first reported unified digit serial systolic digit level pipelined MM architecture for AOP and trinomials over GF (2). Analysis shows that the latency and circuit complexity of the proposed architecture are significantly less compared to earlier design for same class of polynomials. The proposed multiplier has clock cycle latency of (2N) where N=ém/Lù, m is the word size and L is the digit size.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSD.2010.59","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22
Abstract
This paper presents an unified digit-serial systolic multiplication architecture for all-one polynomials (AOP) and trinomial over GF (2m) for efficient implementation of Montgomery Multiplication (MM) algorithm suitable for cryptosystem. This is the first reported unified digit serial systolic digit level pipelined MM architecture for AOP and trinomials over GF (2). Analysis shows that the latency and circuit complexity of the proposed architecture are significantly less compared to earlier design for same class of polynomials. The proposed multiplier has clock cycle latency of (2N) where N=ém/Lù, m is the word size and L is the digit size.