{"title":"并行整数乘法","authors":"Vivien Samuel","doi":"10.1109/pdp55904.2022.00024","DOIUrl":null,"url":null,"abstract":"Multiplication is a fundamental step in many algorithms. If the multiplication of two integers of n words has a complexity of M(n), divisions and squares can be computed in O(M(n)) as well and the greatest common divisor can be computed in O(M(n)logn). Thus being able to have a small value for M(n) is extremely important.To this day, the best known algorithm for reachable values is the Schönhage-Strassen algorithm which is implemented by a few arithmetic libraries. Asymptotically faster algorithms exist, however no computer is able to hold numbers big enough for those algorithms to outrun Schönhage-Strassen.The GNU Multiple Precision (GMP) library has a sequential-only implementation of Schönhage-Strassen.However some algorithms contains a step which is a single big multiplication. Thus when trying to parallelize such an algorithm, one requires a parallel algorithm for multiplication. An example of such an algorithm is the batch factorization for Number Field Sieve. Thus people trying to implement a parallel version of such algorithms need to find an arithmetic library that implements a parallel integer multiplication.An example of such a library is the Flint (Fast LIbrary for Number Theory) library that contains a parallel implementation of Schönhage-Strassen. In this article we present an implementation of Schönhage-Strassen, that reaches a speedup of 20 for the multiplication of two integers of 107 words of 64 bits using a Xeon Gold with 32 cores.","PeriodicalId":210759,"journal":{"name":"2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)","volume":"156 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Parallel integer multiplication\",\"authors\":\"Vivien Samuel\",\"doi\":\"10.1109/pdp55904.2022.00024\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multiplication is a fundamental step in many algorithms. If the multiplication of two integers of n words has a complexity of M(n), divisions and squares can be computed in O(M(n)) as well and the greatest common divisor can be computed in O(M(n)logn). Thus being able to have a small value for M(n) is extremely important.To this day, the best known algorithm for reachable values is the Schönhage-Strassen algorithm which is implemented by a few arithmetic libraries. Asymptotically faster algorithms exist, however no computer is able to hold numbers big enough for those algorithms to outrun Schönhage-Strassen.The GNU Multiple Precision (GMP) library has a sequential-only implementation of Schönhage-Strassen.However some algorithms contains a step which is a single big multiplication. Thus when trying to parallelize such an algorithm, one requires a parallel algorithm for multiplication. An example of such an algorithm is the batch factorization for Number Field Sieve. Thus people trying to implement a parallel version of such algorithms need to find an arithmetic library that implements a parallel integer multiplication.An example of such a library is the Flint (Fast LIbrary for Number Theory) library that contains a parallel implementation of Schönhage-Strassen. In this article we present an implementation of Schönhage-Strassen, that reaches a speedup of 20 for the multiplication of two integers of 107 words of 64 bits using a Xeon Gold with 32 cores.\",\"PeriodicalId\":210759,\"journal\":{\"name\":\"2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)\",\"volume\":\"156 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/pdp55904.2022.00024\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/pdp55904.2022.00024","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
乘法是许多算法的基本步骤。如果n个单词的两个整数的乘法的复杂度为M(n),那么除法和平方也可以在O(M(n))中计算出来,最大公约数可以在O(M(n)logn)中计算出来。因此,M(n)的小值是非常重要的。到目前为止,最著名的可达值算法是Schönhage-Strassen算法,它由几个算法库实现。渐近更快的算法是存在的,但是没有计算机能够容纳足够大的数字,使这些算法超过Schönhage-Strassen。GNU多精度(GMP)库提供了一个仅顺序实现的Schönhage-Strassen。然而,一些算法包含一个步骤,这是一个大的乘法。因此,当试图并行化这样一个算法时,需要一个并行的乘法算法。这种算法的一个例子是Number Field Sieve的批量分解。因此,试图实现这种算法的并行版本的人需要找到一个实现并行整数乘法的算术库。此类库的一个示例是Flint (Number Theory的快速库)库,它包含Schönhage-Strassen的并行实现。在本文中,我们介绍了一个Schönhage-Strassen的实现,使用32核的Xeon Gold,对于两个64位107字整数的乘法,其加速速度达到20。
Multiplication is a fundamental step in many algorithms. If the multiplication of two integers of n words has a complexity of M(n), divisions and squares can be computed in O(M(n)) as well and the greatest common divisor can be computed in O(M(n)logn). Thus being able to have a small value for M(n) is extremely important.To this day, the best known algorithm for reachable values is the Schönhage-Strassen algorithm which is implemented by a few arithmetic libraries. Asymptotically faster algorithms exist, however no computer is able to hold numbers big enough for those algorithms to outrun Schönhage-Strassen.The GNU Multiple Precision (GMP) library has a sequential-only implementation of Schönhage-Strassen.However some algorithms contains a step which is a single big multiplication. Thus when trying to parallelize such an algorithm, one requires a parallel algorithm for multiplication. An example of such an algorithm is the batch factorization for Number Field Sieve. Thus people trying to implement a parallel version of such algorithms need to find an arithmetic library that implements a parallel integer multiplication.An example of such a library is the Flint (Fast LIbrary for Number Theory) library that contains a parallel implementation of Schönhage-Strassen. In this article we present an implementation of Schönhage-Strassen, that reaches a speedup of 20 for the multiplication of two integers of 107 words of 64 bits using a Xeon Gold with 32 cores.