An Explicit Algorithm for Porous Media Flow Simulation using GPUs
Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow Region, Russia
Proceedings of the Second International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering, 2011
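@inproceedings{morozov2011explicit,
title={An Explicit Algorithm for Porous Media Flow Simulation using GPUs},
author={Morozov, DN and Chetverushkin, BN and Churbanova, NG and Trapeznikova, MA},
booktitle={Proceedings of the Second International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering},
year={2011}
}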
The proposed approach targets explicit difference schemes with a simple structure. By analogy with kinetically consistent finite difference schemes and the quasi-gas-dynamic system of equations [1,2], the classical model of slightly compressible fluid flow in porous media is modified to take into account the minimal scales of averaging in space and time. As a result, a regularizing term and a second-order time derivative, both with small parameters, appear in the continuity equation. Changing the type of this equation from parabolic to hyperbolic gives the explicit scheme sufficient stability. Compared with traditional approaches, this allows a larger time step and substantially reduces computational cost. The infiltration of a dense non-aqueous phase liquid (e.g. tetrachloroethylene) into a reservoir filled with fully water-saturated sand is used as a test problem to verify the algorithm on GPUs. The algorithm is parallelized by data partitioning. Computations are performed in double precision on the hybrid cluster MVS-Express. Each node of the cluster includes a four-core CPU (AMD Opteron, 2.6 GHz) and an NVIDIA GeForce GTX 295 graphics accelerator supporting CUDA. The original library Shmem-Express was developed at the Keldysh Institute of Applied Mathematics; its distinguishing feature is one-sided data exchange between nodes. To estimate the efficiency of GPU use, the running time on a single CPU core was compared with the running time on a single graphics card and on four cards belonging to different nodes. On a single card the highest speedup, 48 times, is achieved on a 1024×1024 grid. On four cards the speedup reaches 82 times on a 12288×2048 grid.
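The effect of the added second-order time derivative on the admissible time step can be illustrated with a minimal sketch. The abstract does not give the modified continuity equation, so everything below is an assumption made for illustration: a 1-D model equation tau * u_tt + u_t = kappa * u_xx stands in for the paper's system, the small parameter is set to tau = h, and all function and parameter names are hypothetical. For this model equation, forward Euler applied to the parabolic form (tau = 0) requires dt <= h^2 / (2 * kappa), while the three-level explicit scheme below requires only dt <= h * sqrt(tau / kappa).

```python
import math


def laplacian(u, h, i):
    """Second difference at interior node i of a uniform 1-D grid."""
    return (u[i + 1] - 2.0 * u[i] + u[i - 1]) / h**2


def step_parabolic(u, kappa, h, dt):
    """Forward-Euler step for u_t = kappa * u_xx; end values stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + dt * kappa * laplacian(u, h, i)
    return new


def step_hyperbolized(u, u_prev, kappa, tau, h, dt):
    """Three-level explicit step for tau * u_tt + u_t = kappa * u_xx."""
    a = tau / dt**2   # weight of the second time difference
    b = 0.5 / dt      # weight of the centred first time difference
    new = u[:]
    for i in range(1, len(u) - 1):
        rhs = kappa * laplacian(u, h, i) + 2.0 * a * u[i] - (a - b) * u_prev[i]
        new[i] = rhs / (a + b)
    return new


def max_after(scheme, steps=60, n=101, kappa=1.0):
    """Run `steps` steps from a box profile and return max |u|."""
    h = 1.0 / (n - 1)
    tau = h                                  # assumed scaling of the small parameter
    dt = 0.9 * h * math.sqrt(tau / kappa)    # 90% of the hyperbolic limit
    # Box-shaped initial profile, zero at both boundaries.
    u = [1.0 if abs(i * h - 0.5) <= 0.1 else 0.0 for i in range(n)]
    u_prev = u[:]                            # start from rest: u(-dt) = u(0)
    for _ in range(steps):
        if scheme == "hyperbolized":
            u, u_prev = step_hyperbolized(u, u_prev, kappa, tau, h, dt), u
        else:
            u = step_parabolic(u, kappa, h, dt)
    return max(abs(v) for v in u)


if __name__ == "__main__":
    print("hyperbolized:", max_after("hyperbolized"))
    print("parabolic:   ", max_after("parabolic"))
```

With the default grid (h = 0.01) the chosen dt is 18 times the forward-Euler limit h^2 / (2 * kappa). At that step the three-level scheme stays bounded, while the forward-Euler solution of the parabolic equation grows by many orders of magnitude. This illustrates the stability mechanism only; it is not the two-phase model or the GPU implementation described in the paper.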
November 24, 2011 by hgpu