high performance computing on graphics processing units: hgpu.org

Designing Efficient Many-Core Parallel Algorithms for All-Pairs Shortest-Paths Using CUDA

Quoc-Nam Tran

Lamar University

Seventh International Conference on Information Technology, 2010, ITNG, pp.7-12

DOI:10.1109/ITNG.2010.230

Finding all-pairs shortest paths on a large graph is a fundamental problem in many practical applications such as bioinformatics, internet node traffic, and network routing. In this paper, we present the designs of two efficient parallel algorithms for many-core GPUs using CUDA. Our algorithms expose substantial fine-grained parallelism while keeping global communication to a minimum. By using the global scope of the GPU's global memory, coalescing global memory reads and writes, and avoiding on-chip shared memory bank conflicts, we achieve a speed-up of 2,500x on a desktop computer in comparison with a single-core program. Our algorithms are scalable: they can handle graphs larger than the memory available on the GPUs, and they can take advantage of additional GPUs when these are added to the system.
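The abstract does not name the underlying algorithm, so the following is only a minimal single-core sketch of the all-pairs shortest-paths problem, assuming the classic Floyd-Warshall formulation and non-negative edge weights. The function name `floyd_warshall` and the sample graph are hypothetical and are not taken from the paper. The sketch may help show where fine-grained parallelism comes from: for a fixed intermediate vertex `k`, the updates for all `(i, j)` pairs are independent of one another.

```python
import math

INF = math.inf  # marks "no edge" between two vertices


def floyd_warshall(weights):
    """Return the matrix of shortest-path distances between all vertex pairs.

    weights[i][j] is the weight of edge i -> j, or INF if there is no edge.
    Assumes non-negative weights (an assumption of this sketch).
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is left untouched
    for i in range(n):
        dist[i][i] = min(dist[i][i], 0)  # a vertex reaches itself at cost 0

    for k in range(n):  # allow paths that pass through vertex k
        row_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == INF:
                continue  # i cannot reach k, so k cannot improve row i
            row_i = dist[i]
            # For fixed k, each (i, j) update is independent of the others;
            # this is the loop nest a parallel version could distribute.
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return dist


# Hypothetical 4-vertex directed graph used only for illustration.
graph = [
    [0, 5, INF, 10],
    [INF, 0, 3, INF],
    [INF, INF, 0, 1],
    [INF, INF, INF, 0],
]
distances = floyd_warshall(graph)
print(distances[0][3])  # shortest distance from vertex 0 to vertex 3
```

In this sketch the inner loop walks along a row of a row-major matrix, which is the access pattern that coalesced global memory reads and writes on a GPU would favor. How the paper's two algorithms actually partition the work is not stated in the abstract.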

Tags: Algorithms, Computer science, CUDA, nVidia, Path problems

March 5, 2011 by hgpu
