high performance computing on graphics processing units: hgpu.org


Compressed Multiple-Row Storage Format

Zbigniew Koza, Maciej Matyka, Sebastian Szkoda, Lukasz Miroslaw

Faculty of Physics and Astronomy, University of Wroclaw, pl. M. Borna 9, 50-205 Wroclaw, Poland

arXiv:1203.2946v1 [physics.comp-ph] (13 Mar 2012)

@article{2012arXiv1203.2946K,
  author={{Koza}, Z. and {Matyka}, M. and {Szkoda}, S. and {Miros{\l}aw}, {\L}.},
  title={Compressed Multiple-Row Storage Format},
  journal={ArXiv e-prints},
  archivePrefix={arXiv},
  eprint={1203.2946},
  primaryClass={physics.comp-ph},
  keywords={Physics – Computational Physics, Computer Science – Distributed, Parallel, and Cluster Computing},
  year={2012},
  month={mar},
  adsurl={http://adsabs.harvard.edu/abs/2012arXiv1203.2946K},
  adsnote={Provided by the SAO/NASA Astrophysics Data System}
}


A new format for storing sparse matrices is proposed for efficient sparse matrix-vector (SpMV) product calculation on modern throughput-oriented computer architectures. This format extends the standard compressed row storage (CRS) format and is easily convertible to and from it without any memory overhead. The computational performance of an SpMV kernel for the new format is determined for over 140 sparse matrices on two Fermi-class graphics processing units (GPUs). The efficiency of the kernel, which peaks at 36 GFLOPS in single precision and 25 GFLOPS in double precision, is compared with that of five existing generic algorithms and industrial implementations. The efficiency of the new format is also measured as a function of the mean (mu) and the standard deviation (sigma) of the number of nonzero matrix elements per row. The largest speedup is found for matrices with mu > 20 and mu > sigma > 1.5 and can be as high as 43%.
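For readers unfamiliar with the baseline, the following is a minimal Python sketch of an SpMV product over the standard CRS layout, together with the per-row statistics mu and sigma mentioned in the abstract. It does not implement the proposed multiple-row format, whose layout is not described on this page. The names `crs_spmv`, `row_stats`, `values`, `col_idx`, and `row_ptr` are illustrative assumptions, and sigma is computed here as the population standard deviation, which may differ from the paper's convention.

```python
import math

def crs_spmv(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CRS (also called CSR).

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    """
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

def row_stats(row_ptr):
    """Mean and population standard deviation of nonzeros per row.

    Assumption: population (not sample) standard deviation.
    """
    counts = [row_ptr[r + 1] - row_ptr[r] for r in range(len(row_ptr) - 1)]
    mu = sum(counts) / len(counts)
    sigma = math.sqrt(sum((c - mu) ** 2 for c in counts) / len(counts))
    return mu, sigma

# Illustrative 3x3 matrix:
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 5, 6]]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
col_idx = [0, 2, 1, 0, 1, 2]
row_ptr = [0, 2, 3, 6]
x = [1.0, 1.0, 1.0]

y = crs_spmv(values, col_idx, row_ptr, x)
mu, sigma = row_stats(row_ptr)
```

In this layout each row is an independent dot product, so a GPU kernel typically assigns one thread or one group of threads per row. Rows of very different lengths then leave some threads idle, which is one reason the distribution of nonzeros per row (mu and sigma) matters for SpMV performance.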

Tags: Algorithms, Compression, Computational Physics, CUDA, nVidia, nVidia GeForce GTX 480, Physics, Sparse matrix, Tesla C2070

March 15, 2012 by hgpu

