Parallel Computation of Echelon Forms

Dumas, Jean-Guillaume; Gautier, Thierry; Pernet, Clément; Sultan, Ziad

doi:10.1007/978-3-319-09873-9_42

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8632)

Abstract

We propose efficient parallel algorithms and implementations, on shared memory architectures, of LU factorization over a finite field. Compared to the corresponding numerical routines, we have identified three main specificities of linear algebra over finite fields. First, the arithmetic complexity could be dominated by modular reductions. Therefore, it is mandatory to delay these reductions as much as possible while mixing fine-grain parallelizations of tiled iterative and recursive algorithms. Second, fast linear algebra variants, e.g., using the Strassen-Winograd algorithm, never suffer from instability and can thus be widely used in cascade with the classical algorithms. There, trade-offs are to be made between block sizes well suited to those fast variants and block sizes well suited to load and communication balancing. Third, many applications over finite fields require the rank profile of the matrix (quite often rank deficient) rather than the solution to a linear system. It is thus important to design parallel algorithms that preserve and compute this rank profile. Moreover, as the rank profile is only discovered during the algorithm, the block size has to be dynamic. We propose and compare several block decompositions: tile iterative with left-looking, right-looking and Crout variants, slab recursive, and tile recursive. Experiments demonstrate that the tile recursive variant performs better and matches the performance of reference numerical software when no rank deficiency occurs. Furthermore, even in the most heterogeneous case, namely when all pivot blocks are rank deficient, we show that it is possible to maintain a high efficiency.
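To make the notion of rank profile concrete, here is a minimal sequential sketch in Python. It is not the parallel tiled algorithm of the paper. It assumes the usual definition of the row rank profile (the indices of the rows that are not linear combinations of earlier rows) and a prime modulus `p`. The function name `row_rank_profile` is a hypothetical helper chosen for illustration, and every entry is reduced eagerly, so the delayed modular reductions discussed above are deliberately left out.

```python
def row_rank_profile(matrix, p):
    """Row rank profile of `matrix` over Z/pZ (p prime), by plain Gaussian elimination.

    Illustrative sketch only: sequential, with eager reductions modulo p.
    """
    basis = {}    # pivot column -> stored row with leading entry 1 at that column
    profile = []  # indices of rows independent of all earlier rows
    for i, row in enumerate(matrix):
        v = [x % p for x in row]
        # Eliminate against stored rows in increasing pivot order, so that
        # entries already cleared are never touched again.
        for col in sorted(basis):
            c = v[col]
            if c:
                v = [(a - c * b) % p for a, b in zip(v, basis[col])]
        pivot = next((j for j, x in enumerate(v) if x), None)
        if pivot is not None:
            inv = pow(v[pivot], -1, p)  # modular inverse, Python 3.8+
            basis[pivot] = [(x * inv) % p for x in v]
            profile.append(i)
    return profile


A = [[1, 2, 3],
     [2, 4, 6],   # twice row 0
     [0, 0, 0],
     [1, 0, 1],
     [3, 4, 5]]
print(row_rank_profile(A, 7))  # → [0, 3, 4]
```

The length of the returned list is the rank. Because a new pivot can appear at any row, the number of pivots found inside a block is not known in advance, which is why the abstract says the block size has to be dynamic.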

This work is partly funded by the HPAC project of the French Agence Nationale de la Recherche (ANR 11 BS02 013).

Author information

Authors and Affiliations

LJK-CASYS, UJF, CNRS, Inria, G’INP, UPMF, Grenoble, France
Jean-Guillaume Dumas & Ziad Sultan
LIG-MOAIS UJF, CNRS, Inria, G’INP, UPMF, Grenoble, France
Thierry Gautier & Ziad Sultan
LIP-AriC UJF, CNRS, Inria, UCBL, ÉNS de Lyon, France
Clément Pernet

Editor information

Editors and Affiliations

CRACS/INESC-TEC and FCUP, Universidade do Porto, Rua do Campo Alegre, 1021, 4169-007, Porto, Portugal
Fernando Silva, Inês Dutra & Vítor Santos Costa

About this paper

Cite this paper

Dumas, JG., Gautier, T., Pernet, C., Sultan, Z. (2014). Parallel Computation of Echelon Forms. In: Silva, F., Dutra, I., Santos Costa, V. (eds) Euro-Par 2014 Parallel Processing. Euro-Par 2014. Lecture Notes in Computer Science, vol 8632. Springer, Cham. https://doi.org/10.1007/978-3-319-09873-9_42

DOI: https://doi.org/10.1007/978-3-319-09873-9_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09872-2
Online ISBN: 978-3-319-09873-9
eBook Packages: Computer Science, Computer Science (R0)
