On the GPGPU parallelization issues of finite element approximate inverse preconditioning
Department of Electrical & Computer Engineering, School of Engineering, Democritus University of Thrace, University Campus, Kimmeria, GR 67100 Xanthi, Greece
Journal of Computational and Applied Mathematics (July 2011)
During the last decades, explicit finite element approximate inverse preconditioning methods have been extensively used for efficiently solving sparse linear systems on multiprocessor systems. The effectiveness of explicit approximate inverse preconditioning schemes relies on the use of efficient preconditioners that are close approximants to the coefficient matrix and are fast to compute in parallel. New parallel computational techniques are proposed for the parallelization of the Optimized Banded Generalized Approximate Inverse Finite Element Matrix (OBGAIFEM) algorithm, based on the concept of the "fish bone" computational approach, and for the Explicit Preconditioned Conjugate Gradient type methods on a General Purpose Graphics Processing Unit (GPGPU). The proposed parallel methods have been implemented using Compute Unified Device Architecture (CUDA) developed by NVIDIA. Finally, numerical results for the performance of the finite element explicit approximate inverse preconditioning for solving characteristic two dimensional boundary value problems on a massive multiprocessor interface on a GPU are presented. The CUDA implementation issues of the proposed methods are also discussed.