Heterogeneous Highly Parallel Implementation of Matrix Exponentiation Using GPU

Chittampally Vasanth Raja, Srinivas Balasubramanian, Prakash S. Raghavendra

Department of Information Technology, National Institute of Technology Karnataka, Surathkal, India

International Journal of Distributed and Parallel Systems (IJDPS) Vol.3, No.2, March 2012, arXiv:1204.3052v1 [cs.DC] (13 Apr 2012)

DOI:10.5121/ijdps.2012.3209

The vision of a supercomputer at every desk can be realized with powerful, highly parallel CPUs, GPUs, or APUs. Graphics processors, once specialized for graphics applications only, are now used for computationally intensive general-purpose applications. GFLOPS- and TFLOPS-level performance, once very expensive, has become cheap with GPGPUs. The current work focuses on a highly parallel implementation of matrix exponentiation. Matrix exponentiation is widely used in many areas of the scientific community, ranging from highly critical flight and CAD simulations to financial and statistical applications. The proposed solution uses OpenCL to exploit the massive parallelism offered by many-core GPGPUs. It employs many general GPU optimizations as well as architecture-specific optimizations. The experiments cover optimizations targeted specifically at scientific graphics cards (Tesla C2050). The heterogeneous highly parallel matrix exponentiation method has been tested on matrices of different sizes and with different powers. The devised kernel has shown a 1000X speedup, and a 44-fold speedup over the naive GPU kernel.
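As a point of reference for what is being computed, the following is a minimal CPU sketch in Python. It assumes "matrix exponentiation" here means raising a square matrix to a non-negative integer power, which the phrase "with different powers" suggests. The abstract does not say which algorithm the authors use; binary exponentiation (repeated squaring) is chosen here only for illustration, and the names `mat_mul` and `mat_pow` are hypothetical. The inner multiplication is the part a GPU kernel would parallelize, since each output element is an independent dot product.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    # Each (i, j) entry is an independent dot product, which is why
    # this step maps naturally onto one GPU work-item per entry.
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(a, power):
    """Return a**power for a square matrix and a non-negative integer power.

    Illustrative assumption: binary exponentiation, which needs
    O(log power) multiplications instead of power - 1.
    """
    if power < 0:
        raise ValueError("power must be non-negative")
    n = len(a)
    # Start from the identity matrix, so that a**0 is the identity.
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in a]
    while power > 0:
        if power & 1:           # current low bit set: fold base into result
            result = mat_mul(result, base)
        power >>= 1
        if power:               # square only if more bits remain
            base = mat_mul(base, base)
    return result


if __name__ == "__main__":
    fib = [[1, 1], [1, 0]]
    print(mat_pow(fib, 10))
```

With this scheme, a power of 1024 takes about ten squarings rather than 1023 successive multiplications, so the cost of each individual multiplication dominates for large matrices.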

Tags: Computer science, Heterogeneous systems, Mathematical Software, Numerical Analysis, nVidia, OpenCL, Tesla C2050

April 16, 2012 by hgpu
