Computers & Fluids

Volume 92, 20 March 2014, Pages 253–273

Parallel preconditioners for the unsteady Navier–Stokes equations and applications to hemodynamics simulations

https://doi.org/10.1016/j.compfluid.2013.10.034

Abstract

We are interested in the numerical solution of the unsteady Navier–Stokes equations on large-scale parallel architectures. We consider efficient preconditioners, such as the Pressure Convection–Diffusion (PCD), Yosida, SIMPLE, and algebraic additive Schwarz preconditioners, for the linear systems arising from finite element discretizations on tetrahedral unstructured meshes combined with time-advancing finite difference schemes. To achieve parallel efficiency, we introduce approximate versions of these preconditioners, based on factorizations in which each factor is either inverted exactly or approximated by an ad hoc preconditioner. We investigate their strong scalability for both classical benchmark problems and simulations relevant to hemodynamics, using up to 8192 cores.

Introduction

In this work, we propose efficient, optimal, scalable preconditioners for the unsteady Navier–Stokes equations. We use the Finite Element Method (FEM) for the space discretization, based on tetrahedral unstructured meshes, and an implicit time discretization. To handle the nonlinear problem, we resort to a semi-explicit treatment of the convection term. We therefore have to repeatedly solve a system of the type Ax=b, where x is the unknown vector and both A and b change from one timestep to the next. This indefinite, nonsymmetric system is solved using preconditioned GMRES [1] or its variant FGMRES [2]. The efficiency of these iterative methods relies on the choice of a suitable preconditioner. A popular approach is to design a preconditioner based on an algebraic factorization of A which exploits its block structure, see, e.g., [3], [4].
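For orientation, with an inf-sup stable finite element pair and a semi-explicit treatment of convection, the system matrix has the familiar saddle-point block structure sketched below (a generic sketch: the precise content of the velocity block depends on the time-advancing scheme actually used):

$$
Ax =
\begin{bmatrix} F & B^{T} \\ B & 0 \end{bmatrix}
\begin{bmatrix} u \\ p \end{bmatrix}
=
\begin{bmatrix} b_{u} \\ b_{p} \end{bmatrix}
= b,
\qquad
F = \frac{1}{\Delta t}\,M + \nu K + C(u^{*}),
$$

where M is the velocity mass matrix, K the discrete diffusion operator, C(u^{*}) the convection matrix assembled at the extrapolated velocity u^{*}, and B the discrete divergence operator.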

Physically based methods like SIMPLE (first introduced in [5]), its generalizations SIMPLEC and SIMPLER [4], and the Yosida methods [6], [3] can be written as approximate factorizations of A. Other methods are obtained through algebraic manipulations of the blocks of A. This is the case of the Pressure Convection–Diffusion (PCD) preconditioner [7], [8], [9], which is based on an ideal preconditioner for which GMRES converges in at most two iterations [10]. Nevertheless, its implementation requires particular care.
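Schematically, and following the standard presentations in the references above (the exact scalings used in the paper may differ), these methods start from the block LU factorization of A and replace the inverse of F by a cheap surrogate:

$$
A =
\begin{bmatrix} F & B^{T} \\ B & 0 \end{bmatrix}
=
\begin{bmatrix} F & 0 \\ B & -S \end{bmatrix}
\begin{bmatrix} I & F^{-1}B^{T} \\ 0 & I \end{bmatrix},
\qquad
S = B F^{-1} B^{T}.
$$

SIMPLE replaces F^{-1} by D^{-1} with D = diag(F), yielding the approximate Schur complement \tilde{S} = B D^{-1} B^{T}; Yosida replaces F^{-1} in the Schur complement by the inverse of the time-derivative block, \tilde{S} = \Delta t\, B M^{-1} B^{T}; PCD instead approximates the action of S^{-1} directly as S^{-1} \approx M_p^{-1} F_p A_p^{-1}, where M_p, F_p, and A_p denote a pressure mass matrix, a pressure convection–diffusion operator, and a pressure Laplacian, respectively.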

Another option is to consider the Least-Squares Commutator (LSC) preconditioner (introduced in [11]), which can be built automatically, though at a higher computational cost. The convergence of LSC is independent of the mesh size and depends only mildly on the viscosity. A version for stabilized finite element discretizations has been introduced in [12]. More recently, the so-called Relaxed Dimensional Factorization (RDF) preconditioner has been introduced in [13] as an improvement over the Dimensional Splitting (DS) preconditioner [14]. Experimental results indicate that its convergence rate is independent of the mesh size and depends only mildly on the viscosity; RDF also deals quite well with stretched elements. These preconditioners usually offer desirable properties such as convergence independent of the domain discretization or robustness with respect to variations of the physical parameters. However, in general, the preconditioners above have not been tested on more than 64 cores. Moreover, to the best of our knowledge, they have not been designed and tested in the hemodynamics context.
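For completeness, the LSC approximation of the Schur complement action, in its standard form (with \hat{M} the diagonal of the velocity mass matrix), reads

$$
S^{-1} \approx
\left(B \hat{M}^{-1} B^{T}\right)^{-1}
\left(B \hat{M}^{-1} F \hat{M}^{-1} B^{T}\right)
\left(B \hat{M}^{-1} B^{T}\right)^{-1},
$$

which is why it can be assembled automatically from the blocks of A, at the price of two Poisson-like solves per application.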

Domain decomposition methods (DDMs) follow a different approach that is easier to parallelize: the main domain Ω is decomposed into several subdomains Ω_i, i = 1, …, M. The equations are then solved iteratively on each subdomain, with suitable boundary conditions depending on the neighboring subdomains, to obtain the solution on the full domain Ω. In this work, we first consider the additive Schwarz preconditioner. The locality of the algorithm (i.e., each subdomain is solved separately) is responsible for a growth in the number of iterations as the number of subdomains increases. The preconditioner is then improved by adding a coarse domain solver. In this work, we form the subdomains and the coarse solver using algebraic techniques [15], [16]. A quick review of DDMs is available in, e.g., [17]; for a deeper understanding, the theory is presented in [18] or [19].
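As an illustration, a minimal serial sketch of the one-level additive Schwarz application M^{-1} r = Σ_i R_i^T A_i^{-1} R_i r is given below. The overlapping index sets stand in for the algebraic partitioning (e.g., via the graph partitioners cited above); the class and variable names are ours, not the paper's parallel implementation.

import numpy as np
from scipy.sparse import identity, random as sprandom
from scipy.sparse.linalg import splu

class AdditiveSchwarz:
    """One-level additive Schwarz: M^{-1} r = sum_i R_i^T A_i^{-1} R_i r."""

    def __init__(self, A, subdomains):
        # `subdomains` is a list of (overlapping) index arrays, one per subdomain.
        A = A.tocsr()
        self.n = A.shape[0]
        self.subdomains = subdomains
        # Factor each local block A_i = R_i A R_i^T once; reuse it at every apply.
        self.local_lu = [splu(A[d, :][:, d].tocsc()) for d in subdomains]

    def apply(self, r):
        # Each local solve is independent of the others: in parallel, only the
        # overlap needs neighbor communication. With no coarse level, however,
        # the GMRES iteration count grows with the number of subdomains.
        z = np.zeros(self.n)
        for d, lu in zip(self.subdomains, self.local_lu):
            z[d] += lu.solve(r[d])
        return z

# Tiny usage example with two overlapping subdomains.
if __name__ == "__main__":
    n = 40
    A = (sprandom(n, n, density=0.1, random_state=0) + 10.0 * identity(n)).tocsr()
    M = AdditiveSchwarz(A, [np.arange(0, 24), np.arange(16, 40)])
    z = M.apply(np.ones(n))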

A further option relies on multigrid methods: instead of considering only one mesh/grid to solve a given problem, several levels of grids are used in coordination with restriction and prolongation operators that transfer quantities from one level to another. The key idea behind multigrid is to smooth the error on the fine grids so that the remaining error can be well approximated on coarser grids, where computing the corresponding correction is cheaper. Multigrid is among the most efficient techniques for solving certain partial differential equations (PDEs). We refer to [20] for a historical review of the development of multigrid over the last 30 years, which also includes many references, and to [21], [22] for the theory of multigrid.
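As a sketch of the idea, one V-cycle can be written recursively as follows; the hierarchy of matrices and transfer operators is assumed given (e.g., by an algebraic multigrid setup), the smoother here is damped Jacobi, and all names are illustrative rather than taken from a specific package.

import numpy as np

def v_cycle(levels, b, x, level=0, nu=2, omega=0.7):
    # `levels` is a list of dicts holding the level matrix 'A' and, on all but
    # the coarsest level, restriction 'R' and prolongation 'P' operators
    # (dense numpy arrays in this sketch).
    A = levels[level]["A"]
    if level == len(levels) - 1:
        return np.linalg.solve(A, b)           # coarsest level: direct solve
    d = A.diagonal()
    for _ in range(nu):                        # pre-smoothing (damped Jacobi)
        x = x + omega * (b - A @ x) / d
    R, P = levels[level]["R"], levels[level]["P"]
    rc = R @ (b - A @ x)                       # restrict the residual
    ec = v_cycle(levels, rc, np.zeros_like(rc), level + 1, nu, omega)
    x = x + P @ ec                             # prolongate and correct
    for _ in range(nu):                        # post-smoothing
        x = x + omega * (b - A @ x) / d
    return x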

Our goal is to design an efficient preconditioner for solving problems relevant to hemodynamics on parallel architectures. We first consider a benchmark proposed by Ethier and Steinman [23], as well as a flow-over-an-obstruction problem similar to the one used in [4]. We then focus on the development of highly scalable algorithms to improve cardiovascular applications. In particular, we also want to use the preconditioners developed here to improve those for complex problems such as fluid-structure interaction [24], [25], [26]. For this reason, we introduce a new test case which represents a cerebral blood vessel with an aneurysm. In the future, numerical simulations may become a tool that helps make decisions about medical treatments, in addition to the usual tools such as radiography or magnetic resonance imaging (MRI). Among the most critical aspects of simulations in the medical field are the huge differences in the cardiovascular system from one patient to another, the short computational time to solution required to make a diagnosis in emergency procedures, and the multiscale nature of the models (i.e., the cardiovascular system ranges from large arteries to small capillaries). We refer to [27] for an overview of techniques related to the modeling and simulation of the cardiovascular system. The medical industry may benefit from efficient and reliable techniques to run patient-specific simulations that help diagnosis in a reasonable time frame. Reducing the time to solution and increasing the problem size may also enable the validation of models for large medical simulations. Typical medical problems may require solving systems with millions of unknowns at each timestep. The only way to tackle them is to use parallel supercomputers, which are continuously developed and improved. It is not trivial to take advantage of such machines, though.

In what follows, we define preconditioners based on different factorizations of A. In particular, we show that using "embedded" preconditioners to apply the inverses of algebraic operators may lead to strongly scalable methods for solving the Navier–Stokes equations. We test our preconditioners on large problems with a high number of processes (up to N=8192). This paper focuses on how state-of-the-art preconditioners can be ported to large parallel platforms. In Section 2, we describe the mathematical model used to solve the Navier–Stokes equations. In Section 3, we recall the different preconditioners that will be used in this work. In Section 4, we introduce some properties that matter in the development of preconditioners for High Performance Computing (HPC). A new preconditioning strategy based on approximate inverses is proposed in Sections 5 and 6. Section 7 presents the scalability results obtained on a Cray XE6. Finally, some conclusions are drawn in Section 8.

Section snippets

Mathematical model

The Navier–Stokes equations for an incompressible viscous fluid read:

$$
\begin{aligned}
\partial_t \mathbf{u} - \nu\Delta\mathbf{u} + (\mathbf{u}\cdot\nabla)\mathbf{u} + \nabla p &= \mathbf{f} && \text{in } \Omega,\ t>0,\\
\nabla\cdot\mathbf{u} &= 0 && \text{in } \Omega,\ t>0,\\
\mathbf{u} &= \mathbf{g}_D && \text{on } \Gamma_D,\ t>0,\\
\nu\frac{\partial\mathbf{u}}{\partial n} - p\,\mathbf{n} &= \mathbf{g}_N && \text{on } \Gamma_N,\ t>0,\\
\mathbf{u} &= \mathbf{u}_0 && \text{in } \Omega,\ t=0,
\end{aligned}
$$

where Ω is the fluid domain, Γ_D and Γ_N are the Dirichlet and Neumann parts of the boundary, respectively, u is the fluid velocity, p the pressure, ν the kinematic viscosity of the fluid, f the external forces, and g_D and g_N are assigned functions.

Let us introduce $H^1_D(\Omega) = \{ f \in H^1(\Omega)\ |\ f = 0 \text{ on } \Gamma_D \}$, and let $V$ denote the space $[H^1_D(\Omega)]^3$ and $Q$ the space $L^2(\Omega)$.
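For context, the weak formulation these spaces lead to reads as follows (a standard sketch consistent with the boundary conditions above; the paper's exact statement may differ): find $\mathbf{u}(t) \in V$ (up to a lifting of $\mathbf{g}_D$) and $p(t) \in Q$ such that

$$
\begin{aligned}
(\partial_t \mathbf{u}, \mathbf{v}) + \nu(\nabla\mathbf{u}, \nabla\mathbf{v}) + ((\mathbf{u}\cdot\nabla)\mathbf{u}, \mathbf{v}) - (p, \nabla\cdot\mathbf{v}) &= (\mathbf{f}, \mathbf{v}) + (\mathbf{g}_N, \mathbf{v})_{\Gamma_N} && \forall\, \mathbf{v} \in V,\\
(\nabla\cdot\mathbf{u}, q) &= 0 && \forall\, q \in Q.
\end{aligned}
$$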

Preconditioners

In this section, we summarize how to form the core preconditioners that will be used in this work. We also comment briefly on their performance.

Preconditioners for high performance computing

Designing a preconditioner for High Performance Computing (HPC) requires special care: the preconditioner has to be efficient for the linear system at hand without suffering from the parallelization imposed by supercomputers. We discuss in this section the must-have properties of a good preconditioner for HPC.

From now on, we use the term process to describe an atomic compute task of a parallel application. A parallel application consists of P processes. A process can be either a thread or an MPI process.

Approximate preconditioners for the Navier–Stokes equations

In Section 4, we have seen that parallelism adds extra difficulties when designing good preconditioners. The additive Schwarz preconditioner perfectly illustrates this issue (see Section 3.1): as the number of subdomains grows (we assign one subdomain to each core), the convergence rate of GMRES deteriorates.

The inexact factorizations introduced in Sections 3.2 (SIMPLE), 3.3 (Yosida), and 3.4 (Pressure Convection–Diffusion) are used to define new approximate preconditioners, as sketched below.
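To fix ideas, applying an approximate SIMPLE-type preconditioner z = P^{-1} r amounts to one block lower-triangular and one block upper-triangular solve, with the exact inverses replaced by embedded solvers. The sketch below assumes D = diag(F) and the approximate Schur complement S~ = B D^{-1} B^T; `approx_F_inv` and `approx_S_inv` are hypothetical stand-ins for the embedded preconditioners (e.g., an additive Schwarz sweep or a multigrid cycle), and the relaxation parameter used by some SIMPLE variants is omitted.

def apply_asimple(B, Dinv, approx_F_inv, approx_S_inv, r_u, r_p):
    # Lower block-triangular solve with [F, 0; B, -S~]:
    y_u = approx_F_inv(r_u)              # embedded solve with F
    y_p = approx_S_inv(B @ y_u - r_p)    # embedded solve with S~ = B D^{-1} B^T
    # Upper block-triangular solve with [I, D^{-1} B^T; 0, I]:
    z_p = y_p
    z_u = y_u - Dinv * (B.T @ z_p)       # Dinv is the vector diag(F)^{-1}
    return z_u, z_p

Since only approximate applications of F^{-1} and S~^{-1} are needed, the outer iteration effectively sees a variable preconditioner, which is why the flexible variant FGMRES is relevant when the embedded solves are themselves iterative.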

Strategies for applying the algebraic operators

The efficiency of the preconditioners presented in Sections 5.1 (approximate SIMPLE), 5.2 (approximate Yosida), and 5.3 (approximate Pressure Convection–Diffusion) relies on that of the embedded preconditioners. More precisely, we would like those preconditioners to enjoy the properties described in Section 4. To obtain efficient parallel algorithms, we need to properly balance the workload among the different collaborating parallel tasks.

Numerical results

We test our preconditioners on three different cases: an analytical solution on a cube [23], an obstruction problem inspired by [4], and a cerebral aneurysm. Table 2 reports the characteristic and non-dimensional quantities used for each problem. In Section 7.5, we focus on testing the weak scalability of the preconditioners on a thoracic aorta.

We have chosen the parameters to cover different Reynolds numbers. Note that Δt is the same for the three benchmarks, but the Reynolds number spans three orders of magnitude.

Conclusions

In this work, we focused on developing preconditioners suitable for high performance computing on parallel machines with many cores. We have paid specific attention to results relevant to hemodynamics simulations. Our preconditioners have been tested against three distinctive criteria: scalability, optimality, and robustness. We proposed a strategy to develop approximate preconditioners which are fast to build and to apply, and considered the aSIMPLE, aYosida, and aPCD preconditioners.

Acknowledgements

We acknowledge the European Research Council Advanced Grant "Mathcard, Mathematical Modelling and Simulation of the Cardiovascular System", Project ERC-2008-AdG 227058, and the Swiss Platform for High-Performance and High-Productivity Computing (HP2C). This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project IDs h02 and s392. We gratefully acknowledge CSCS for providing the CPU resources for our simulations.

References (43)

  • Y. Saad. A flexible inner-outer preconditioned GMRES algorithm. SIAM J Sci Comput (1993).
  • D. Kay et al. A preconditioner for the steady-state Navier–Stokes equations. SIAM J Sci Comput (2002).
  • H.C. Elman et al. Boundary conditions in approximate commutator preconditioners for the Navier–Stokes equations. Electron Trans Numer Anal (2009).
  • M.F. Murphy et al. A note on preconditioning for indefinite linear systems. SIAM J Sci Comput (2000).
  • H. Elman et al. Block preconditioners based on approximate commutators. SIAM J Sci Comput (2006).
  • H. Elman et al. Least squares preconditioners for stabilized discretizations of the Navier–Stokes equations. SIAM J Sci Comput (2007).
  • G. Karypis, K. Schloegel, V. Kumar. METIS: a software package for partitioning unstructured graphs, partitioning meshes, ...
  • G. Karypis, K. Schloegel, V. Kumar. ParMETIS: parallel graph partitioning and sparse matrix ordering library. Tech rep, ...
  • A. Quarteroni. Numerical models for differential problems (2009).
  • A. Quarteroni et al. Domain decomposition methods for partial differential equations (1999).
  • A. Toselli et al. Domain decomposition methods—algorithms and theory (2005).