Efficient CUDA polynomial preconditioned Conjugate Gradient solver for Finite Element computation of elasticity problems
College of Mechanics and Materials, Hohai University, 1 Xikang Road, Nanjing 210098, China
Mathematical Problems in Engineering
@article{zhang2013efficient,
title={Efficient CUDA polynomial preconditioned Conjugate Gradient solver for Finite Element computation of elasticity problems},
author={Zhang, Jianfei and Zhang, Lei},
journal={Mathematical Problems in Engineering},
year={2013}
}
Graphics Processing Units (GPUs) have achieved great success in scientific computing thanks to their tremendous computational horsepower and very high memory bandwidth. This paper discusses an efficient way to implement a polynomial preconditioned conjugate gradient (PCG) solver for the finite element computation of elasticity on NVIDIA GPUs using Compute Unified Device Architecture (CUDA). The Sliced Block ELLPACK (SBELL) format is introduced to store the sparse matrices arising from finite element discretization of elasticity with fewer padding zeros than traditional ELLPACK-based formats. Polynomial preconditioning methods are investigated in terms of both convergence and running time. Based on overall performance, the Least-Squares (L-S) polynomial method is chosen as the preconditioner in the PCG solver for finite element equations derived from elasticity, as it gives the best results on different example meshes. In the PCG solver, a mixed precision algorithm is used not only to reduce the overall computational cost, storage requirements, and bandwidth demand but also to make full use of the capacity of the GPU devices. With the SBELL format and the mixed precision algorithm, the GPU-based L-S preconditioned CG achieves a speedup of about 7 to 9 over the CPU implementation.
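To make the idea of polynomial preconditioning concrete, the following is a minimal CPU sketch in plain Python, not the paper's implementation. It rests on several assumptions: a truncated Neumann series (with Jacobi scaling) stands in for the Least-Squares polynomial, dense nested lists stand in for the SBELL sparse format, everything runs in double precision rather than mixed precision, and the names `matvec`, `neumann_precond`, `pcg`, and `laplacian_1d` are hypothetical. It shows that applying the preconditioner needs only matrix-vector products, which is the property that makes polynomial preconditioners a natural fit for GPUs.

```python
def matvec(A, x):
    """Dense matrix-vector product (a GPU code would use a sparse SpMV kernel)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def neumann_precond(A, r, degree):
    """Return z = p(A) r, where p is a truncated Neumann series.

    With D = diag(A) and G = I - D^-1 A, this evaluates
    z = (I + G + ... + G^degree) D^-1 r by Horner's rule.
    Assumption: the spectral radius of G is below 1.
    """
    d_inv = [1.0 / A[i][i] for i in range(len(A))]
    base = [di * ri for di, ri in zip(d_inv, r)]  # D^-1 r
    z = base[:]
    for _ in range(degree):
        Az = matvec(A, z)
        # z <- D^-1 r + G z = D^-1 r + z - D^-1 (A z)
        z = [b + zi - di * azi for b, zi, di, azi in zip(base, z, d_inv, Az)]
    return z


def pcg(A, b, degree=2, tol=1e-10, max_iter=500):
    """Preconditioned CG for a symmetric positive definite A.

    Returns the approximate solution and the number of iterations performed.
    """
    x = [0.0] * len(b)
    r = b[:]                      # residual for the zero initial guess
    z = neumann_precond(A, r, degree)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = neumann_precond(A, r, degree)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def laplacian_1d(n):
    """Tridiagonal SPD test matrix (2 on the diagonal, -1 off it); illustrative only."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    A = laplacian_1d(20)
    b = matvec(A, [1.0] * 20)     # right-hand side with known solution of all ones
    x, iterations = pcg(A, b, degree=2)
    print(iterations, max(abs(xi - 1.0) for xi in x))
```

Raising `degree` trades extra matrix-vector products per iteration for fewer CG iterations; whether that pays off depends on the matrix and hardware, which is the kind of convergence versus running time comparison the abstract describes.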
September 14, 2013 by hgpu