high performance computing on graphics processing units: hgpu.org


GPU-accelerated WZ Factorization with the Use of the CUBLAS Library

Beata Bylina, Jaroslaw Bylina

Institute of Mathematics, Marie Curie-Sklodowska University, Pl. M. Curie-Sklodowskiej 5, 20-031 Lublin, Poland

Preprints of the Federated Conference on Computer Science and Information Systems, pp. 537-543, 2012

@article{bylina2012gpu,
  title={GPU-accelerated WZ Factorization with the Use of the CUBLAS Library},
  author={Bylina, Beata and Bylina, Jaroslaw},
  year={2012}
}


We present a novel implementation of the WZ factorization, a factorization algorithm for dense, square, non-structured matrices, that uses graphics processors (GPUs) together with CPUs to achieve high performance at low cost. We rewrite the factorization as operations on blocks of matrices and vectors. We implemented this block-vector algorithm on GPUs using a ready-to-use GPU-accelerated mathematical library, CUBLAS, and compared its performance with CPU implementations. In particular, our implementation on an NVIDIA Tesla C2050 GPU outperforms a CPU-based implementation. Our results show that the algorithm scales well with matrix size: the larger the matrix, the better the performance. We also discuss how the matrix size and the use of ready-to-use mathematical libraries affect numerical accuracy.
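For readers unfamiliar with the WZ factorization, the following is a minimal scalar sketch of the textbook form of the method, written in plain Python. It is not the authors' block-vector CUBLAS implementation, which this page does not reproduce. The function names `wz_factorize` and `matmul` are hypothetical, and the sketch assumes no pivoting is needed (every 2x2 pivot determinant is nonzero). At step `k` the rows `k` and `n-1-k` are used together to zero out columns `k` and `n-1-k` in all rows between them, so that `A = W * Z`.

```python
def wz_factorize(a):
    """Return (W, Z) such that A = W * Z, without pivoting (illustrative sketch).

    W has a unit diagonal and stores the multipliers; Z is the reduced matrix.
    """
    n = len(a)
    z = [[float(x) for x in row] for row in a]  # working copy, becomes Z
    w = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for k in range((n - 1) // 2):
        k2 = n - 1 - k  # the row/column paired with k
        # Determinant of the 2x2 pivot block built from rows k and k2.
        det = z[k][k] * z[k2][k2] - z[k2][k] * z[k][k2]
        if det == 0.0:
            raise ZeroDivisionError("zero 2x2 pivot; this sketch has no pivoting")

        for i in range(k + 1, k2):
            # Solve the 2x2 system so that row i minus wk*row k minus wk2*row k2
            # has zeros in columns k and k2.
            wk = (z[i][k] * z[k2][k2] - z[k2][k] * z[i][k2]) / det
            wk2 = (z[k][k] * z[i][k2] - z[k][k2] * z[i][k]) / det
            w[i][k] = wk
            w[i][k2] = wk2
            z[i][k] = 0.0
            z[i][k2] = 0.0
            # Update the inner part of row i (two scaled row subtractions).
            for j in range(k + 1, k2):
                z[i][j] -= wk * z[k][j] + wk2 * z[k2][j]
    return w, z


def matmul(x, y):
    """Plain triple-loop matrix product, used only to check W * Z against A."""
    n, m, p = len(x), len(y), len(y[0])
    return [[sum(x[i][t] * y[t][j] for t in range(m)) for j in range(p)]
            for i in range(n)]
```

The inner update subtracts multiples of two pivot rows from every row between them, which amounts to a pair of rank-1 updates on the inner submatrix. Updates of this kind can be expressed through BLAS-style matrix and vector routines, which is presumably what makes a library such as CUBLAS a natural fit for the approach described in the abstract.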

Tags: Computer science, CUBLAS, CUDA, Factorization, nVidia, Tesla C2050

September 3, 2012 by hgpu
