Abstract
This work shows that, with only a few standard OpenMP compiler directives and no major source code modifications, simulations of both steady and unsteady incompressible fluid flows by the implicit, stiffly stable time-stepping projection method can easily be ported to run in parallel on any shared-memory, multicore system with reasonably good computational performance. The shared-memory, multicore architecture is the state-of-the-art hardware environment for modern microprocessors, including the computation nodes of supercomputers, workstations, and personal desktops and laptops. This work modifies the original semi-implicit, stiffly stable time-stepping projection method to be fully implicit. Under the modified method, the approximation of the incompressible Navier-Stokes (INS) equations is broken into a sequence of steps. Each step consists of three stages, dealing with the advection, pressure, and viscosity terms of the INS equations, respectively. Each stage of every step has its own governing equation(s), the physical discretization of which transforms the partial differential equation(s) into a set of linear algebraic equations of the general matrix form Ax=b. The ILU(0) preconditioner and the generalized minimal residual (GMRES) algorithm are used for the preconditioning and solution of Ax=b, respectively. With Kovasznay and Pearson vortex flows as reference problems for steady and unsteady incompressible flows, respectively, profiling tests show that for both flows more than 99% of the total simulation time is spent constructing the ILU(0) preconditioner and running GMRES, so these two components are singled out for OpenMP parallelization. Benchmarked on a computation node with 16 CPU cores, mean parallel efficiencies of 72% and 73% are achieved for simulations of Kovasznay and Pearson vortex flows, respectively.
When all 16 CPU cores are used, simulations of Kovasznay and Pearson vortex flows are accelerated by factors of 8.0 and 8.17, respectively. The performance and efficiency of the OpenMP parallel implementation also depend on the hardware system. For the Kovasznay flow simulation run on a workstation with 12 CPU cores, the mean parallel efficiency is 89%, compared with the 72% achieved on the 16-CPU-core computation node.
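To relate the quoted speedups to parallel efficiency, the usual textbook definitions can be applied: speedup S_p = T_1 / T_p and efficiency E_p = S_p / p for p cores. The abstract does not spell these definitions out, so the sketch below assumes them, and the function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return parallel efficiency E = S / p as a fraction (assumed textbook definition)."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return speedup / cores


# Speedups quoted in the abstract for the 16-core computation node.
reported_speedups = {"Kovasznay": 8.0, "Pearson vortex": 8.17}

for flow, speedup in reported_speedups.items():
    efficiency = parallel_efficiency(speedup, 16)
    print(f"{flow}: speedup {speedup} on 16 cores -> efficiency {efficiency:.1%}")
```

Under these assumed definitions, the 16-core efficiencies come out at roughly 50% and 51%, below the reported mean efficiencies of 72% and 73%. That would be consistent with a mean taken over several core counts, although the abstract does not state how the mean is computed.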
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:apmaco:v:355:y:2019:i:c:p:238-252. See general information about how to correct material in RePEc.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: https://www.journals.elsevier.com/applied-mathematics-and-computation .