A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU

hgpu.org » Programming » Algorithms » A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU

A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU

Tomohiro Okuyama, Fumihiko Ino, Kenichi Hagihara

Graduate School of Information Science and Technology, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan

Proceedings of 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications (2008)

DOI:10.1109/ISPA.2008.40

BibTeX

Download (PDF)

View

Source

2336

views

This paper proposes a fast method for computing the costs of all-pairs shortest paths (APSPs) on the graphics processing unit (GPU). The proposed method is implemented using compute unified device architecture (CUDA), which offers us a development environment for performing general-purpose computation on the GPU. Our method is based on Harish’s iterative algorithm that computes the cost of the single-source shortest path (SSSP) for every source vertex. We present that exploiting task parallelism in the APSP problem allows us to efficiently use on-chip memory in the GPU, reducing the amount of data being transferred from relatively slower off-chip memory. Furthermore, our task parallel scheme is useful to exploit a higher parallelism, increasing the efficiency with highly threaded code. As a result, our method is 3.4-15 times faster than the prior method. Using on-chip memory, our method eliminates approximately 20% of data loads from off-chip memory.

Tags: Algorithms, All-Pairs Distance, Computer science, CUDA, nVidia, nVidia GeForce 8800 GTS, Path problems

November 5, 2010 by hgpu

No votes yet.

Please wait...

high performance computing on graphics processing units: hgpu.org

A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU

Recent source codes

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

SYCL Container

CASS: Cuda-Amd aSSembly

Cluser of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

Most viewed papers (last 30 days)

A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU

Share this:

Recent source codes

Most viewed papers (last 30 days)