high performance computing on graphics processing units: hgpu.org

Developing a High Performance Software Library with MPI and CUDA for Matrix Computations

Bogdan Oancea, Tudorel Andrei

"Nicolae Titulescu" University of Bucharest

Computational Methods in Social Sciences (CMSS), Vol. I, Issue 2/2013, 2013

The paradigm of parallel computing is changing. CUDA is now a popular programming model for general-purpose computations on GPUs, and a great number of applications have been ported to CUDA, obtaining speedups of orders of magnitude compared with optimized CPU implementations. Hybrid approaches that combine the message-passing model with the shared-memory model for parallel computing are a solution for very large applications. We considered a heterogeneous cluster that combines CPU and GPU computations, using MPI and CUDA to develop a high-performance linear algebra library. Our library focuses on solvers for large linear systems, because such systems are a common problem in science and engineering. Direct methods for solving them can be very expensive because of their high memory requirements and computational cost. Iterative methods, which compute only an approximation of the solution, are an efficient alternative. In this paper we present a library that uses a hybrid MPI/CUDA model of computation and implements both direct and iterative linear system solvers: solvers based on LU and Cholesky factorization, and some of the non-stationary iterative methods. We compared the performance of our MPI/CUDA implementation with that of classic programs written to run on a single CPU.
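
To make the idea of a non-stationary iterative solver concrete, here is a minimal single-CPU sketch of the conjugate gradient method in plain Python. It is not the authors' MPI/CUDA implementation, and the abstract does not say which non-stationary methods the library provides; conjugate gradient is used here only as a well-known member of that family. The function names (`matvec`, `dot`, `conjugate_gradient`) and the parameters `tol` and `max_iter` are illustrative assumptions, and the sketch assumes a symmetric positive definite matrix.

```python
# Illustrative sketch only: a serial conjugate gradient solver for A x = b,
# assuming A is symmetric positive definite. All names are hypothetical and
# are not taken from the library described in the abstract.

def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Approximate the solution of A x = b, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)           # residual b - A x, which equals b when x = 0
    p = list(r)           # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:   # stop once the residual norm is small
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)              # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                   # changes every iteration
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))  # exact solution is [1/11, 7/11]
```

The coefficients `alpha` and `beta` are recomputed from the current residual at every step, which is what makes the method non-stationary. In a hybrid setting, the matrix-vector product and the inner products are the natural candidates for GPU kernels and for reductions across processes, but how the paper's library divides that work is not stated in the abstract.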

Tags: Computer science, CUDA, Factorization, Heterogeneous systems, Linear Algebra, Memory model, MPI, nVidia, nVidia GeForce GTX 280

December 29, 2013 by hgpu
