high performance computing on graphics processing units: hgpu.org


Fast Linear Algebra on GPU

Lukas Polok, Pavel Smrz

Brno University of Technology, Faculty of Information Technology, IT4Innovations Centre of Excellence, Bozetechova 2, 61266 Brno, Czech Republic

IEEE conference proceedings, Liverpool, GB, IEEE CS, 2012

@inproceedings{polok2012fast,

title={Fast Linear Algebra on GPU},

author={Polok, L. and Smr{\v{z}}, P.},

booktitle={IEEE conference proceedings},

pages={6},

organization={IEEE Computer Society},

year={2012}

}


GPUs have been used successfully to accelerate many mathematical functions and libraries. A common limitation of these libraries is the minimum size of the primitives they must handle in order to achieve a significant speedup over their CPU versions. This minimum size requirement can prove prohibitive for many applications. It can be relaxed by batching operations so that there is a sufficient amount of data to run the calculation efficiently on the GPU. This paper describes a fast OpenCL implementation of two basic vector functions, vector reduction and vector scaling. Its performance is analyzed by running benchmarks on two of the most common GPU architectures in use, Tesla and Fermi from NVIDIA. The reported experimental results show that our implementation significantly outperforms CUBLAS, the current state-of-the-art GPU-based basic linear algebra library.
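The batching idea in the abstract can be illustrated with a minimal CPU-side sketch in plain Python. This is not the authors' OpenCL code, which is not reproduced on this page. It only shows, under stated assumptions, how many small vectors might be packed into one flat buffer so that a single pass handles all of them. The function names (pack_batch, batched_scale, batched_reduce) are hypothetical, and "reduction" is assumed here to mean a sum.

```python
from typing import List, Sequence, Tuple


def pack_batch(vectors: Sequence[Sequence[float]]) -> Tuple[List[float], List[int]]:
    """Concatenate small vectors into one flat buffer.

    Returns the buffer and a list of offsets; vector i occupies
    flat[offsets[i]:offsets[i + 1]].
    """
    flat: List[float] = []
    offsets: List[int] = [0]
    for v in vectors:
        flat.extend(v)
        offsets.append(len(flat))
    return flat, offsets


def batched_scale(flat: Sequence[float], offsets: Sequence[int],
                  factors: Sequence[float]) -> List[float]:
    """Scale every packed vector by its own factor in one pass over the buffer."""
    out = [0.0] * len(flat)
    for i, alpha in enumerate(factors):
        for j in range(offsets[i], offsets[i + 1]):
            out[j] = alpha * flat[j]
    return out


def batched_reduce(flat: Sequence[float], offsets: Sequence[int]) -> List[float]:
    """Sum each packed vector (assumption: the reduction is a plain sum)."""
    return [sum(flat[offsets[i]:offsets[i + 1]]) for i in range(len(offsets) - 1)]


# Three vectors that would each be too small to be worth a separate GPU call.
vectors = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
flat, offsets = pack_batch(vectors)
sums = batched_reduce(flat, offsets)
scaled = batched_scale(flat, offsets, [2.0, 0.5, 1.0])
```

On a GPU, the flat buffer and the offsets would be uploaded once and processed by a single kernel launch, which is what lets small primitives share the fixed launch and transfer overhead.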

Tags: Computer science, CUBLAS, Linear Algebra, nVidia, nVidia GeForce GTX 260, nVidia GeForce GTX 590, OpenCL

July 28, 2012 by hgpu
