high performance computing on graphics processing units: hgpu.org

Multi-GPU Implementation of the Uniformization Method for Solving Markov Models

Marek Karwacki, Beata Bylina, Jaroslaw Bylina

Institute of Mathematics, Marie Curie-Sklodowska University, Pl. M. Curie-Sklodowskiej 5, 20-031 Lublin, Poland

Preprints of the Federated Conference on Computer Science and Information Systems pp. 561-565, 2012

Markovian models can generate very large sparse matrices, which are difficult to store and solve. Uniformization is a useful method for finding transient probabilities in Markovian models. The aim of this paper is to show that the performance of uniformization can be improved using a multi-GPU architecture. We propose a partitioning scheme for the HYB sparse matrix storage format, together with optimization techniques that minimize communication between GPUs during iterative sparse matrix-vector multiplication, which is the most time-consuming step. The experimental results show that a multi-GPU system can solve larger matrices than a single device and can accelerate computations in comparison to a multithreaded CPU. Computational tests were carried out in double precision for wireless network models. Using multiple GPUs, we were able to solve a model described by a matrix of size 3.6×10^7.
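
To make the method named in the abstract concrete, here is a minimal single-threaded Python sketch of uniformization. It is not the authors' multi-GPU code. It assumes the generator matrix Q is given as coordinate (row, column, value) triples, takes the uniformization rate Λ to be the largest exit rate, forms P = I + Q/Λ, and sums the Poisson-weighted iterates π(0)P^k. The function name `uniformization`, its parameters, and the truncation rule are hypothetical choices made for illustration. The repeated vector-matrix product inside the loop corresponds to the step the abstract identifies as the most time-consuming.

```python
import math

def uniformization(q_triples, n, pi0, t, eps=1e-12, max_terms=100000):
    """Approximate the transient distribution pi(t) of a CTMC.

    q_triples: generator matrix Q as (i, j, q_ij) triples (assumed layout).
    n: number of states; pi0: initial distribution; t: time point.
    """
    # Uniformization rate: the largest exit rate max_i |q_ii|.
    lam = max((-q for i, j, q in q_triples if i == j), default=0.0)
    if lam <= 0.0 or t == 0.0:
        return list(pi0)  # no transitions can occur

    # Build the DTMC matrix P = I + Q / lam in sparse dictionary form.
    p = {(i, i): 1.0 for i in range(n)}
    for i, j, q in q_triples:
        p[(i, j)] = p.get((i, j), 0.0) + q / lam

    v = list(pi0)                     # holds pi0 * P^k
    weight = math.exp(-lam * t)       # Poisson weight for k = 0
    total = weight                    # accumulated Poisson mass
    result = [weight * x for x in v]

    # Note: for very large lam * t the first weight underflows to zero;
    # a production code would use a stabilized weight computation.
    k = 0
    while total < 1.0 - eps and k < max_terms:
        k += 1
        # Sparse vector-matrix product v <- v * P.
        nxt = [0.0] * n
        for (i, j), pij in p.items():
            nxt[j] += v[i] * pij
        v = nxt
        weight *= lam * t / k         # next Poisson weight
        total += weight
        for s in range(n):
            result[s] += weight * v[s]
    return result
```

For a two-state chain with rates a (state 0 to 1) and b (state 1 to 0), the result can be checked against the closed-form solution p_0(t) = b/(a+b) + (p_0(0) - b/(a+b)) e^{-(a+b)t}.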

Tags: Computer science, CUDA, nVidia, Sparse matrix, Tesla M2050

August 31, 2012 by hgpu
