To access the full text documents, please follow this link: http://hdl.handle.net/2117/13326

Hybrid MPI+OpenMP parallelization of an FFT-based 3D Poisson solver with one periodic direction
Gorobets, Andrei; Trias Miquel, Francesc Xavier; Borrell Pol, Ricard; Lehmkuhl Barba, Oriol; Oliva Llena, Asensio
Universitat Politècnica de Catalunya. Departament de Màquines i Motors Tèrmics; Universitat Politècnica de Catalunya. CTTC - Centre Tecnològic de la Transferència de Calor
This work is devoted to the development of efficient parallel algorithms for the direct numerical simulation (DNS) of incompressible flows on modern supercomputers. In such simulations, a Poisson equation must be solved at each time step to project the velocity field onto a divergence-free space. Because its solution is non-local, this elliptic system is the part of the algorithm that is most difficult to parallelize. The Poisson solver presented here is restricted to problems with one uniform periodic direction. It combines a block preconditioned Conjugate Gradient (PCG) method with an FFT diagonalization. The latter decomposes the original system into a set of mutually independent 2D systems that are solved by means of the PCG algorithm. For the most ill-conditioned systems, which correspond to the lowest Fourier frequencies, the PCG is replaced by a direct Schur-complement-based solver. The previous version of the Poisson solver was conceived for single-core (and dual-core) processors, and it therefore used the distributed memory model with the message-passing interface (MPI). The emergence of multi-core architectures motivated the use of a two-level hybrid MPI + OpenMP parallelization, with the shared memory model on the second level. Advantages and implementation details of the additional OpenMP parallelization are presented and discussed in this paper. Numerical experiments show that, within its range of efficient scalability, the previous MPI-only parallelization is slightly outperformed by the MPI + OpenMP approach. More importantly, the hybrid parallelization has significantly extended the range of efficient scalability. The solver has been successfully tested on up to 12,800 CPU cores for meshes with up to 10^9 grid points. Estimates based on the presented results suggest that this range could potentially be extended to approximately 200,000 cores. Finally, several examples of DNS are briefly presented to illustrate some potential applications of the solver.
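The decoupling described in the abstract can be illustrated with a minimal serial sketch. It is not the authors' implementation, and it makes several simplifying assumptions: the domain is 2D (one Dirichlet direction x, one uniform periodic direction z), so each decoupled system is 1D rather than 2D; a naive DFT stands in for the FFT; plain unpreconditioned CG stands in for the block PCG; and the Schur-complement direct solver and the MPI/OpenMP levels are omitted. All function names (dft, apply_mode_operator, conjugate_gradient, solve_poisson, residual_max) are hypothetical.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for m in range(n):
        s = sum(v * cmath.exp(sign * 2j * math.pi * m * k / n)
                for k, v in enumerate(values))
        out.append(s / n if inverse else s)
    return out


def apply_mode_operator(v, shift, hx):
    """Apply (T + shift*I), T = tridiag(-1, 2, -1)/hx^2, zero Dirichlet ends."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * v[i] - left - right) / hx**2 + shift * v[i])
    return out


def conjugate_gradient(shift, hx, b, tol=1e-12, max_iter=1000):
    """Unpreconditioned CG for one decoupled system (assumption: no PCG)."""
    x = [0.0] * len(b)
    bb = sum(bi * bi for bi in b)
    if bb == 0.0:
        return x
    r = list(b)
    p = list(r)
    rr = bb
    for _ in range(max_iter):
        if math.sqrt(rr) <= tol * math.sqrt(bb):
            break
        ap = apply_mode_operator(p, shift, hx)
        alpha = rr / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_poisson(f, hx, hz):
    """Solve -Laplacian(u) = f; f[i][k] has Dirichlet index i, periodic index k."""
    nx, nz = len(f), len(f[0])
    f_hat = [dft(row) for row in f]  # transform along the periodic direction
    u_hat = [[0j] * nz for _ in range(nx)]
    for m in range(nz):  # each Fourier mode is an independent system
        # Eigenvalue of the periodic second-difference operator for mode m.
        shift = (2.0 - 2.0 * math.cos(2.0 * math.pi * m / nz)) / hz**2
        # The operator is real, so real and imaginary parts are solved separately.
        re = conjugate_gradient(shift, hx, [f_hat[i][m].real for i in range(nx)])
        im = conjugate_gradient(shift, hx, [f_hat[i][m].imag for i in range(nx)])
        for i in range(nx):
            u_hat[i][m] = complex(re[i], im[i])
    return [[c.real for c in dft(row, inverse=True)] for row in u_hat]


def residual_max(u, f, hx, hz):
    """Largest absolute residual of the coupled (untransformed) system."""
    nx, nz = len(u), len(u[0])
    worst = 0.0
    for i in range(nx):
        for k in range(nz):
            left = u[i - 1][k] if i > 0 else 0.0
            right = u[i + 1][k] if i < nx - 1 else 0.0
            au = ((2.0 * u[i][k] - left - right) / hx**2
                  + (2.0 * u[i][k] - u[i][(k - 1) % nz] - u[i][(k + 1) % nz]) / hz**2)
            worst = max(worst, abs(au - f[i][k]))
    return worst


if __name__ == "__main__":
    nx, nz = 6, 8
    hx, hz = 1.0 / (nx + 1), 1.0 / nz
    f = [[math.sin(i + 1) * math.cos(2 * k) for k in range(nz)] for i in range(nx)]
    u = solve_poisson(f, hx, hz)
    print("max residual:", residual_max(u, f, hx, hz))
```

In this sketch the shift added to each decoupled system is smallest for the lowest frequencies, which is why those systems are the worst conditioned. Because the loop over Fourier modes has no dependencies between iterations, it is also the natural place to distribute work across processes or threads.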
Peer Reviewed
-UPC subject areas::Physics::Thermodynamics::Heat transfer physics
-UPC subject areas::Computer science::Computer applications::Computer applications in physics and engineering
-Poisson solver
-Direct numerical simulation
-Schur complement
-Supercomputers -- Mathematical models
-Supercomputers
Article - Published version
Article