high performance computing on graphics processing units: hgpu.org


CUDA implementation of the solution of a system of linear equations arising in an hp-Finite Element code

Javier Oses Villanueva

Departamento de Matematica Aplicada y Estadistica e I.O., University of the Basque Country UPV/EHU, and Ikerbasque

Universidad de Zaragoza, 2013

@article{villanueva2013cuda,
title={CUDA implementation of the solution of a system of linear equations arising in an hp-Finite Element code},
author={Villanueva, Javier Os{\'e}s},
year={2013}
}


The finite element method (FEM) has proven to be one of the most efficient methods for solving differential equations. FEM codes are designed to run on different computer architectures, and over the years technological improvements have made it possible to solve larger and larger problems quickly. Among these improvements, we emphasize the development of the GPU (Graphics Processing Unit). Scientific programming on graphics cards was extremely difficult until 2006, when NVIDIA developed CUDA (Compute Unified Device Architecture), a parallel computing platform and programming model designed for general-purpose computing that does not require knowledge of traditional graphics programming. GPUs can perform a large number of operations simultaneously, which makes them very attractive for use in FEM. One of the parts of FEM that requires the most computational capacity is the solution of systems of linear equations. In this work, an algorithm for solving systems of linear equations has been implemented in CUDA. It is applied as part of an hp-FEM code that solves the Laplace equation. The aim of this study is to compare the performance of a CUDA implementation of the solver with a C implementation and to check whether CUDA has advantages over traditional programming. For that purpose, we select an algorithm suitable for GPU programming. Iterative algorithms have properties that fit the CUDA programming architecture. However, these algorithms require double precision arithmetic to minimize round-off effects, and at present only high performance GPUs are able to work in double precision. FEM matrices are sparse, so a compressed storage format for the system matrix is needed. Multiple compression formats exist, and we select the one that best fits the matrix structure that FEM generates in our problem. The CUDA implementation improves execution times compared to traditional programming in C. Recent works have shown that programs running up to 80 times faster can be obtained. However, this result cannot be generalized, because the improvement depends on the differential equation, the boundary conditions, the mesh generation, the FEM formulation, the GPU model, the version of CUDA (5.0 at the time of writing) and, of course, the implementation.

Tags: Algorithms, Computer science, CUDA, Differential equations, Finite element method, Laplace and Poisson equation, Linear Algebra, nVidia, Thesis

May 15, 2013 by hgpu

