Pub Date : 2021-06-01 DOI: 10.1109/ARITH51176.2021.00035
Rami Elkhatib, R. Azarderakhsh, Mehran Mozaffari Kermani
Software implementations of cryptographic algorithms are slow but highly flexible and relatively easy to develop. Hardware implementations, on the other hand, are usually faster but offer little flexibility and take considerable time to implement efficiently. In this paper, we develop a hybrid software-hardware implementation of the third-round version of Supersingular Isogeny Key Encapsulation (SIKE), a post-quantum cryptography candidate in the NIST standardization process. We implement an isogeny field accelerator in hardware and integrate it with a RISC-V processor, which also acts as the main control unit for the accelerator. The main advantage of this design is that it combines the high performance of the hardware implementation with the flexibility and fast development of the software implementation. This is the first hybrid design of SIKE that combines a RISC-V processor with an accelerator. Furthermore, we provide a single implementation that covers all NIST security levels of SIKE. Our design has the best area-time product at NIST security levels 3 and 5 among all hardware and hybrid designs in the literature.
Title: Accelerated RISC-V for SIKE
Published in: 2021 IEEE 28th Symposium on Computer Arithmetic (ARITH)
Pub Date : 2021-06-01 DOI: 10.1109/ARITH51176.2021.00029
Andreas Böttcher, M. Kumm, F. D. Dinechin
This work presents the resource-optimal design of truncated multipliers targeting field-programmable gate arrays (FPGAs). In contrast to application-specific integrated circuits (ASICs), FPGA design poses distinct challenges because the partial products can be computed in many ways using logic-based or DSP-based sub-multipliers. To tackle this, we extend a previously proposed tiling methodology that translates multiplier design into a geometrical problem: the target multiplier is represented by a board that has to be covered by tiles representing the sub-multipliers. The tiling that uses the fewest resources can be found with integer linear programming (ILP). Our extension accounts for the error caused by positions of the board that may be left unoccupied and determines the tiling with the fewest resources that respects the maximal allowed error bound. This error bound is chosen such that a faithfully rounded truncated multiplier is obtained. Compared to previous designs that use a fixed number of guard bits or optimize at the level of the dot diagrams, this allows a much better use of sub-multipliers, resulting in significant area savings without sacrificing timing.
Title: Resource Optimal Truncated Multipliers for FPGAs
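The error accounting mentioned in the truncated-multiplier abstract above can be illustrated with a small sketch. It does not reproduce the ILP tiling from the paper; it only assumes the simplest possible tiles (single partial-product bits) and leaves every board position (i, j) with i + j < k unoccupied. The function names and the parameters `n` and `k` are hypothetical, chosen for illustration.

```python
def truncated_product(a: int, b: int, n: int, k: int) -> int:
    """Multiply two n-bit unsigned integers, dropping every partial-product
    bit a_i * b_j whose weight 2**(i + j) lies below column k."""
    total = 0
    for i in range(n):
        for j in range(n):
            if i + j >= k and (a >> i) & 1 and (b >> j) & 1:
                total += 1 << (i + j)
    return total


def max_truncation_error(n: int, k: int) -> int:
    """Worst-case error of the truncation: every dropped bit equals 1."""
    return sum(1 << (i + j) for i in range(n) for j in range(n) if i + j < k)


# Exhaustive check for a small (assumed) operand width and cut-off column.
n, k = 6, 4
worst = max_truncation_error(n, k)
observed = max(a * b - truncated_product(a, b, n, k)
               for a in range(1 << n) for b in range(1 << n))
```

Here the observed maximum matches the analytical bound because both operands can have all bits set at once. An optimizer in the spirit of the abstract would compare such a bound against the error budget allowed for faithful rounding when deciding which positions may stay uncovered.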
Pub Date : 2021-06-01 DOI: 10.1109/ARITH51176.2021.00024
M. Fasi, M. Mikaitis
Published in "IEEE Transactions on Emerging Topics in Computing, Volume: 9, Issue: 3, July-September 2021" and orally presented at ARITH 2021.
Title: Algorithms for Stochastically Rounded Elementary Arithmetic Operations in IEEE 754 Floating-Point Arithmetic
Pub Date : 2021-06-01 DOI: 10.1109/ARITH51176.2021.00032
F. D. Dinechin, Silviu-Ioan Filip, M. Kumm, Anastasia Volkova
A hardware implementation can be defined to be faithful to the frequency specification of a linear time-invariant digital filter. Filter design and implementation then become a single global optimisation problem. To solve this problem, existing tools are reviewed, and the missing ones are framed.
Title: Towards Arithmetic-Centered Filter Design
Pub Date : 2021-06-01 DOI: 10.1109/arith51176.2021.00010
Title: Industry Panel ARITH 2021: Processors for the Computing of the 2020s
Pub Date : 2021-06-01 DOI: 10.1109/ARITH51176.2021.00014
Nestor Demeure, C. Chevalier, C. Denis, P. Dossantos-Uzarralde
Extensive work has been done to evaluate the numerical accuracy of computations. However, obtaining fine-grained information on the operations that caused the inaccuracies observed in a given output is still a hard problem. We propose a new method, called tagged error, to obtain fine-grained information on the impact of user-defined code sections on the numerical error of any floating-point number in a program. Our method uses a dedicated arithmetic over a type that encapsulates both the result the user would have obtained with the original computation and an approximation of its numerical error, stored as an unevaluated sum of terms that can each be attributed to a single source. It lets us quantify the impact of potential error sources on any output of a computation while taking into account phenomena such as error amplification or dampening due to later operations. Furthermore, we can use this information to make targeted modifications to an algorithm, improving both its speed and precision, as illustrated by a study on the conjugate gradient algorithm.
Title: Tagged error: tracing numerical error through computations
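To make the idea in the tagged-error abstract above more concrete, here is a minimal sketch of a type that carries a value together with an unevaluated sum of error terms, one per tag. It is not the authors' implementation: the class name `Tagged`, the global `current_tag`, and the use of exact rational arithmetic to measure each rounding error are assumptions made for illustration, and error propagation through multiplication is only first-order.

```python
from fractions import Fraction


class Tagged:
    """A float plus an estimate of its error, split by source tag.

    Convention: exact result ~ value + sum(errors.values()).
    """

    current_tag = "untagged"  # hypothetical marker for the active code section

    def __init__(self, value, errors=None):
        self.value = float(value)
        self.errors = dict(errors or {})  # tag -> error contribution

    @staticmethod
    def _finish(result, rounding, propagated):
        # Charge the fresh rounding error to the section that is active now.
        errors = dict(propagated)
        tag = Tagged.current_tag
        errors[tag] = errors.get(tag, 0.0) + rounding
        return Tagged(result, errors)

    def __add__(self, other):
        result = self.value + other.value
        # Rounding error of this addition, measured in rational arithmetic.
        rounding = float(Fraction(self.value) + Fraction(other.value)
                         - Fraction(result))
        tags = set(self.errors) | set(other.errors)
        propagated = {t: self.errors.get(t, 0.0) + other.errors.get(t, 0.0)
                      for t in tags}
        return Tagged._finish(result, rounding, propagated)

    def __mul__(self, other):
        result = self.value * other.value
        rounding = float(Fraction(self.value) * Fraction(other.value)
                         - Fraction(result))
        tags = set(self.errors) | set(other.errors)
        # First-order propagation: d(x*y) ~ y*dx + x*dy.
        propagated = {t: self.errors.get(t, 0.0) * other.value
                      + other.errors.get(t, 0.0) * self.value
                      for t in tags}
        return Tagged._finish(result, rounding, propagated)

    def total_error(self):
        return sum(self.errors.values())


# Usage: two tagged sections, then inspect which one the error comes from.
Tagged.current_tag = "sum"
s = Tagged(0.1) + Tagged(0.2)
Tagged.current_tag = "scale"
r = s * Tagged(3.0)
```

After this run, `r.errors` holds one entry per section, and the entry for `"sum"` has been amplified by the later multiplication, which is the kind of effect the abstract describes. The sketch does not handle infinities or NaNs, since `Fraction` cannot represent them.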
Pub Date : 2021-06-01 DOI: 10.1109/ARITH51176.2021.00015
Jean-Michel Muller
Expressions such as $ax^{2}$, $axy$, or $ax^{3}$, where $a$ is a constant, are not infrequent in computing. There are several ways of parenthesizing them (and therefore of choosing the order of evaluation). Depending on the value of $a$, is there a more accurate evaluation order? We discuss this point (with a small digression on spurious underflows and overflows).
Title: $a \cdot (x \cdot x)$ or $(a \cdot x) \cdot x$?
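The question raised in the abstract above can be explored numerically. The sketch below does not reproduce the paper's analysis; it simply evaluates both parenthesizations in binary64 and measures their relative error against the exact rational value. The helper name `relative_errors` and the sample inputs are assumptions chosen for illustration.

```python
from fractions import Fraction


def relative_errors(a: float, x: float):
    """Relative errors of a*(x*x) and (a*x)*x against the exact a*x**2."""
    exact = Fraction(a) * Fraction(x) ** 2
    right_first = a * (x * x)
    left_first = (a * x) * x
    return (abs((Fraction(right_first) - exact) / exact),
            abs((Fraction(left_first) - exact) / exact))


U = Fraction(1, 2 ** 53)  # unit roundoff of binary64, round to nearest
err_right, err_left = relative_errors(0.1, 1.7)

# Spurious overflow: the exact result is about 1e200, which is representable,
# but one evaluation order passes through an intermediate that is not.
a, x = 1e-200, 1e200
overflowed = a * (x * x)  # x*x exceeds the binary64 range and becomes inf
safe = (a * x) * x        # every intermediate stays finite
```

Both orders perform two rounded multiplications, so in the absence of underflow and overflow each relative error is at most 2u + u² with u = 2⁻⁵³. Which order is more accurate for a given constant is the subject of the paper and is not settled by this sketch.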