We first investigate the parallelization of an optimal Rubik's cube solver, with particular attention to GPU acceleration. To examine its efficacy, we implement a simple solver based on Korf's algorithm, in which the CPU and GPU collaborate in the IDA* algorithm and a large number of GPU cores, rather than a huge distance table used for pruning, provide the speedup. In our empirical studies, GPU acceleration attained a sufficient speedup. There are many similar puzzles, the so-called permutation puzzles. Solving such a puzzle, i.e., restoring the original ordered state from a scrambled one, is equivalent to path-finding in the Cayley graph of the permutation group. We generalize the method used for Rubik's cube to much smaller problems and examine its efficacy. Our research focuses on how efficient parallel path-finding can be and whether, in general, a large number of cores can substitute for a large distance table used for pruning.
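The path-finding view can be illustrated with a minimal, CPU-only IDA* sketch. Everything here is an assumption made for illustration and not the authors' solver: the toy puzzle (permutations of a few items, generated by adjacent swaps), the weak heuristic standing in for a distance table, and the names `apply_move`, `heuristic`, and `ida_star`.

```python
from math import inf

def apply_move(state, i):
    """Swap positions i and i+1 (one generator of the toy permutation group)."""
    s = list(state)
    s[i], s[i + 1] = s[i + 1], s[i]
    return tuple(s)

def heuristic(state):
    """Admissible lower bound: one swap fixes at most two misplaced items."""
    misplaced = sum(1 for pos, val in enumerate(state) if pos != val)
    return (misplaced + 1) // 2

def ida_star(start):
    """Return a shortest list of generator indices that sorts `start`."""
    goal = tuple(range(len(start)))
    path = []

    def search(state, g, bound, last_move):
        f = g + heuristic(state)
        if f > bound:
            return f              # smallest f seen beyond the current bound
        if state == goal:
            return True
        next_bound = inf
        for i in range(len(state) - 1):
            if i == last_move:
                continue          # repeating a swap only undoes it
            path.append(i)
            result = search(apply_move(state, i), g + 1, bound, i)
            if result is True:
                return True
            path.pop()
            next_bound = min(next_bound, result)
        return next_bound

    bound = heuristic(start)
    while True:                   # iterative deepening on the f-bound
        result = search(start, 0, bound, None)
        if result is True:
            return path
        bound = result

solution = ida_star((3, 2, 1, 0))
```

Each deepening iteration explores independent subtrees, which is the part a parallel solver could distribute across many cores; the sketch itself stays sequential.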
{"title":"GPU-acceleration of optimal permutation-puzzle solving","authors":"Hayakawa Hiroki, Ishida Naoaki, M. Hirokazu","doi":"10.1145/2790282.2790289","DOIUrl":"https://doi.org/10.1145/2790282.2790289","url":null,"abstract":"We first investigate parallelization of Rubik's cube optimal solver, especially for acceleration by GPU. To examine its efficacy, we implement a simple solver based on Korf's algorithm, with which CPU and GPU collaborate in IDA* algorithm and a large number of GPU cores are utilized for speedup instead of a huge distance table used for pruning. Empirical studies succeeded to attain sufficient speedup by GPU-acceleration. There are many other similar puzzles of so-called permutation puzzles. The puzzle solving, i.e., restoring the original ordered state from a scrambled one is equivalent to the path-finding in the Cayley graph of the permutation group. We generalize the method used for Rubik's cube to much smaller problems, and examine its efficacy. The focus of our research interest is how efficient the parallel path-finding can be and whether the use of a large number of cores substitutes for a large distance table used for pruning, in general.","PeriodicalId":384227,"journal":{"name":"Proceedings of the 2015 International Workshop on Parallel Symbolic Computation","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128532329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To interpolate a supersparse polynomial with integer coefficients, two alternative approaches are the Prony-based "big prime" technique, which acts over a single large finite field, and the more recently proposed "small primes" technique, which reduces the unknown sparse polynomial to many low-degree dense polynomials. While the latter technique has not yet reached the same theoretical efficiency as Prony-based methods, it has an obvious potential for parallelization. We present a heuristic "small primes" interpolation algorithm and report on a low-level C implementation using FLINT and MPI.
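The reduction to low-degree dense polynomials can be sketched as follows. This assumes the reduction is taken modulo x^p - 1 for a small prime p, in the style of Garg-Schost methods; the abstract does not spell this out. The function name `reduce_mod_cyclic` and the example polynomial are hypothetical.

```python
def reduce_mod_cyclic(sparse_poly, p):
    """Return the coefficient list of f mod (x^p - 1), which has degree < p.

    `sparse_poly` maps exponents to integer coefficients.
    """
    dense = [0] * p
    for exponent, coeff in sparse_poly.items():
        dense[exponent % p] += coeff   # x^e is congruent to x^(e mod p)
    return dense

# Hypothetical supersparse input: 3 - 2*x^(10^6) + 5*x^(10^12 + 7)
f = {0: 3, 10**6: -2, 10**12 + 7: 5}

# One independent dense image per small prime.
images = {p: reduce_mod_cyclic(f, p) for p in (5, 7, 11)}
```

In this example the images for p = 5 and p = 7 each merge two terms of f into one coefficient, while p = 11 keeps all three terms apart, which is why many primes are used. Because each image is computed independently, the per-prime work is a natural unit to distribute.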
{"title":"Parallel sparse interpolation using small primes","authors":"Mohamed Khochtali, Daniel S. Roche, Xisen Tian","doi":"10.1145/2790282.2790290","DOIUrl":"https://doi.org/10.1145/2790282.2790290","url":null,"abstract":"To interpolate a supersparse polynomial with integer coefficients, two alternative approaches are the Prony-based \"big prime\" technique, which acts over a single large finite field, or the more recently-proposed \"small primes\" technique, which reduces the unknown sparse polynomial to many low-degree dense polynomials. While the latter technique has not yet reached the same theoretical efficiency as Prony-based methods, it has an obvious potential for parallelization. We present a heuristic \"small primes\" interpolation algorithm and report on a low-level C implementation using FLINT and MPI.","PeriodicalId":384227,"journal":{"name":"Proceedings of the 2015 International Workshop on Parallel Symbolic Computation","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115951414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dereje Kifle Boku, C. Fieker, W. Decker, Andreas Steenpaß
Although Buchberger's algorithm allows us, in theory, to compute Gröbner bases over any field, in practice its computational efficiency depends on the arithmetic of the ground field. Consider a field K = Q(α), a simple extension of Q, where α is an algebraic number, and let f ∈ Q[t] be the minimal polynomial of α. In this paper we present a new, efficient method to compute Gröbner bases in polynomial rings over the algebraic number field K. Starting from the ideas of Noro [11], we proceed by joining f to the ideal under consideration, adding t as an extra variable. But instead of avoiding superfluous S-pair reductions by inverting algebraic numbers, we achieve the same goal by applying modular methods as in [2, 3, 10], that is, by inferring information in characteristic zero from information in characteristic p > 0. For suitable primes p, the minimal polynomial f is reducible over F_p. This allows us to apply modular methods once again, on a second level, with respect to the factors of f. The algorithm thus resembles a divide-and-conquer strategy and is, in particular, easily parallelizable. In its current state, the algorithm is probabilistic in the sense that, as for other modular Gröbner basis computations, an effective final verification test is known only for homogeneous ideals or for local monomial orderings. The presented timings show that for most examples our algorithm, which has been implemented in Singular [7], outperforms other known methods by far.
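A small sketch of what "f is reducible over F_p for suitable primes p" means, using the hypothetical example f = t^2 - 2 (so α = √2), which is not taken from the paper. For a quadratic, having a root in F_p is equivalent to being reducible, so a brute-force root search is enough here; the helper name `roots_mod_p` is an assumption.

```python
def roots_mod_p(coeffs, p):
    """Roots in F_p of a polynomial given by coefficients from the constant term up."""
    def evaluate(t):
        acc = 0
        for c in reversed(coeffs):     # Horner's rule, reduced mod p at each step
            acc = (acc * t + c) % p
        return acc
    return [t for t in range(p) if evaluate(t) == 0]

f = [-2, 0, 1]                         # t^2 - 2

splits_mod_7 = roots_mod_p(f, 7)       # [3, 4]: t^2 - 2 = (t - 3)(t - 4) over F_7
splits_mod_5 = roots_mod_p(f, 5)       # []: 2 is not a square mod 5, so f stays irreducible
```

Each root r corresponds to a linear factor t - r of f over F_p, so p = 7 would be a suitable prime for this f and p = 5 would not. The computations attached to the separate factors do not depend on one another, which matches the divide-and-conquer description above.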
{"title":"Gröbner bases over algebraic number fields","authors":"Dereje Kifle Boku, C. Fieker, W. Decker, Andreas Steenpaß","doi":"10.1145/2790282.2790284","DOIUrl":"https://doi.org/10.1145/2790282.2790284","url":null,"abstract":"Although Buchberger's algorithm, in theory, allows us to compute Gröbner bases over any field, in practice, however, the computational efficiency depends on the arithmetic of the ground field. Consider a field K = Q(α), a simple extension of Q, where α is an algebraic number, and let f ∈ Q[t] be the minimal polynomial of α. In this paper we present a new efficient method to compute Gröbner bases in polynomial rings over the algebraic number field K. Starting from the ideas of Noro [11], we proceed by joining f to the ideal to be considered, adding t as an extra variable. But instead of avoiding superfluous S-pair reductions by inverting algebraic numbers, we achieve the same goal by applying modular methods as in [2, 3, 10], that is, by inferring information in characteristic zero from information in characteristic p > 0. For suitable primes p, the minimal polynomial f is reducible over Fp. This allows us to apply modular methods once again, on a second level, with respect to the factors of f. The algorithm thus resembles a divide and conquer strategy and is in particular easily parallelizable. At current state, the algorithm is probabilistic in the sense that, as for other modular Gröbner basis computations, an effective final verification test is only known for homogeneous ideals or for local monomial orderings. The presented timings show that for most examples, our algorithm, which has been implemented in Singular [7], outperforms other known methods by far.","PeriodicalId":384227,"journal":{"name":"Proceedings of the 2015 International Workshop on Parallel Symbolic Computation","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116844111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Numerical continuation methods track a solution path defined by a homotopy. The systems we consider are defined by polynomials in several variables with complex coefficients. For larger dimensions and degrees, the numerical conditioning worsens and hardware double precision often becomes insufficient to reach the end of the solution path. With double double and quad double arithmetic, we can solve larger problems that we could not solve with hardware double arithmetic, but at a higher computational cost. This cost overhead can be compensated for by acceleration on a Graphics Processing Unit (GPU). We describe our implementation and report computational results on benchmark polynomial systems.
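To make the cost of double double arithmetic concrete, here is a minimal sketch of double double addition built on Knuth's error-free TwoSum transformation. It represents a value as an unevaluated pair (hi, lo) of hardware doubles; the names `two_sum` and `dd_add` are chosen for illustration, and this is a simplified variant, not the authors' GPU implementation.

```python
def two_sum(a, b):
    """Error-free transformation: returns (s, e) with s = fl(a + b) and a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def dd_add(x, y):
    """Add two double-double numbers, each given as a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])   # add the leading parts, keeping the rounding error
    e += x[1] + y[1]             # fold in the trailing parts
    return two_sum(s, e)         # renormalize so that hi carries the leading digits

one = (1.0, 0.0)
tiny = (1e-20, 0.0)

total = dd_add(one, tiny)                 # keeps the tiny term in the low word
recovered = dd_add(total, (-1.0, 0.0))    # subtracting 1 recovers it
lost = (1.0 + 1e-20) - 1.0                # plain doubles lose it entirely
```

A single double double addition already takes over a dozen hardware floating-point operations in this form, which gives a sense of the overhead that GPU acceleration is meant to offset.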
{"title":"Accelerating polynomial homotopy continuation on a graphics processing unit with double double and quad double arithmetic","authors":"J. Verschelde, Xiangcheng Yu","doi":"10.1145/2790282.2790294","DOIUrl":"https://doi.org/10.1145/2790282.2790294","url":null,"abstract":"Numerical continuation methods track a solution path defined by a homotopy. The systems we consider are defined by polynomials in several variables with complex coefficients. For larger dimensions and degrees, the numerical conditioning worsens and hardware double precision becomes often insufficient to reach the end of the solution path. With double double and quad double arithmetic, we can solve larger problems that we could not solve with hardware double arithmetic, but at a higher computational cost. This cost overhead can be compensated by acceleration on a Graphics Processing Unit (GPU). We describe our implementation and report on computational results on benchmark polynomial systems.","PeriodicalId":384227,"journal":{"name":"Proceedings of the 2015 International Workshop on Parallel Symbolic Computation","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122992724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 2015 International Workshop on Parallel Symbolic Computation","authors":"","doi":"10.1145/2790282","DOIUrl":"https://doi.org/10.1145/2790282","url":null,"abstract":"","PeriodicalId":384227,"journal":{"name":"Proceedings of the 2015 International Workshop on Parallel Symbolic Computation","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115639614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}