high performance computing on graphics processing units: hgpu.org

Architecture-and Workload-Aware Heterogeneous Algorithms for Sparse Matrix Vector Multiplication

Sivaramakrishna Bharadwaj Indarapu, Manoj Maramreddy, Kishore Kothapalli

Center for Security, Theory and Algorithmic Research (CSTAR), International Institute of Information Technology, Hyderabad, Gachibowli, Hyderabad, India, 500 032

19th International Conference on Parallel and Distributed Systems, 2013

Multiplying a sparse matrix by a vector, denoted spmv, is a fundamental operation in linear algebra with several applications. Hence, efficient and scalable implementation of spmv has been a topic of extensive research. Recent efforts are aimed at implementations on GPUs, multicore architectures, and other emerging computational platforms. Owing to the highly irregular nature of spmv, it is observed that GPUs and CPUs can offer comparable performance. In this paper, we propose three heterogeneous algorithms for spmv that simultaneously utilize both the CPU and the GPU. This is shown to lead to better resource utilization in addition to performance gains. Our experiments with the work division schemes on standard datasets indicate that it is not, in general, possible to choose the most appropriate scheme for a given matrix. We therefore consider a class of sparse matrices that exhibit a scale-free nature and identify a scheme that works well for such matrices. Finally, we use simple and effective mechanisms to determine the appropriate amount of work to be allotted to the CPU and the GPU.
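The abstract does not describe the three work division schemes in detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the matrix is stored in CSR form and that the rows are split into two contiguous blocks so that the first block holds roughly a chosen fraction of the nonzeros. The names `spmv_csr`, `split_row_by_nnz`, `heterogeneous_spmv`, and the parameter `cpu_fraction` are hypothetical, and the "GPU" share is simulated on the CPU here so the example stays self-contained.

```python
from bisect import bisect_left


def spmv_csr(row_ptr, col_idx, vals, x, row_start=0, row_end=None):
    """Return y[row_start:row_end] for y = A @ x, with A given in CSR form."""
    if row_end is None:
        row_end = len(row_ptr) - 1
    y = []
    for i in range(row_start, row_end):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y.append(acc)
    return y


def split_row_by_nnz(row_ptr, cpu_fraction):
    """Hypothetical helper: smallest row index s such that rows [0, s)
    contain at least cpu_fraction of all nonzeros."""
    target = cpu_fraction * row_ptr[-1]
    return bisect_left(row_ptr, target)


def heterogeneous_spmv(row_ptr, col_idx, vals, x, cpu_fraction=0.5):
    """Assumed row-wise split: first block to the CPU, the rest to the GPU.
    Both parts run on the CPU in this sketch; a real implementation would
    launch a GPU kernel for the second block and overlap the two."""
    n_rows = len(row_ptr) - 1
    split = split_row_by_nnz(row_ptr, cpu_fraction)
    y_cpu = spmv_csr(row_ptr, col_idx, vals, x, 0, split)
    y_gpu = spmv_csr(row_ptr, col_idx, vals, x, split, n_rows)  # GPU stand-in
    return y_cpu + y_gpu


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]

y = heterogeneous_spmv(row_ptr, col_idx, vals, x, cpu_fraction=0.5)
```

Splitting by nonzero count rather than by row count is one plausible way to balance irregular matrices, since rows of a scale-free matrix can differ widely in length. Choosing `cpu_fraction` well is the part the abstract says the paper addresses with its own mechanisms, which this sketch does not attempt to reproduce.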

Tags: Algorithms, Computer science, CUDA, Heterogeneous systems, Linear Algebra, nVidia, nVidia GeForce GTX 480, Sparse matrix

October 21, 2013 by hgpu
