high performance computing on graphics processing units: hgpu.org

A Tuned, Concurrent-Kernel Approach to Speed Up the APSP Problem

Hector Ortega-Arranz, Yuri Torres, Diego R. Llanos, Arturo Gonzalez-Escribano

Dpto. Informatica, Universidad de Valladolid

13th International Conference Computational and Mathematical Methods in Science and Engineering (CMMSE 2013), 2013

@article{ortega2013tuned,

title={A Tuned, Concurrent-Kernel Approach to Speed Up the APSP Problem},

author={Ortega-Arranz, Hector and Torres, Yuri and Llanos, Diego R and Gonzalez-Escribano, Arturo},

year={2013}

}

The All-Pairs Shortest-Path (APSP) problem is a well-known problem in graph theory whose objective is to find the shortest paths between every pair of nodes. Computing the distances from one source node to all other nodes, and repeating this process for every node of the graph, is an adequate solution for sparse graphs. In recent years, GPU devices have increasingly been used to accelerate this kind of problem. While a correct NVIDIA CUDA implementation of this algorithm is easy to achieve, exploiting the GPU's capabilities to obtain good performance is a task for experienced CUDA programmers. A typical code-tuning strategy is the selection of an appropriate threadBlock size. In addition, concurrently launching several kernels that compute distances from different sources also reduces execution time. In this paper we show that an adequate combination of both strategies yields an 11.5% performance improvement compared with other recommended CUDA configurations for the most costly kernel of the APSP problem.
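
The repeated single-source strategy described in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' GPU implementation: it runs on the CPU, uses Dijkstra's algorithm with a binary heap as an assumed single-source solver, and every name in it (`Edge`, `Graph`, `kUnreachable`, `sssp`, `apsp`) is hypothetical. It assumes non-negative edge weights small enough that path sums do not overflow `unsigned`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; none of these names come from the paper.
struct Edge {
    std::size_t to;
    unsigned weight;  // non-negative weights, as Dijkstra requires
};
using Graph = std::vector<std::vector<Edge>>;  // adjacency list
constexpr unsigned kUnreachable = std::numeric_limits<unsigned>::max();

// Single-source shortest paths (Dijkstra with a binary heap).
std::vector<unsigned> sssp(const Graph& g, std::size_t source) {
    assert(source < g.size());
    std::vector<unsigned> dist(g.size(), kUnreachable);
    using Item = std::pair<unsigned, std::size_t>;  // (distance, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> frontier;
    dist[source] = 0;
    frontier.push({0, source});
    while (!frontier.empty()) {
        auto [d, u] = frontier.top();
        frontier.pop();
        if (d > dist[u]) continue;  // stale queue entry, already improved
        for (const Edge& e : g[u]) {
            unsigned candidate = d + e.weight;
            if (candidate < dist[e.to]) {
                dist[e.to] = candidate;
                frontier.push({candidate, e.to});
            }
        }
    }
    return dist;
}

// APSP by repetition: one independent single-source run per source node.
std::vector<std::vector<unsigned>> apsp(const Graph& g) {
    std::vector<std::vector<unsigned>> all;
    all.reserve(g.size());
    for (std::size_t s = 0; s < g.size(); ++s) all.push_back(sssp(g, s));
    return all;
}
```

Each iteration of the loop in `apsp` depends only on the read-only graph and its own source node, so runs for different sources are independent of one another. That independence is what makes it possible, in a GPU setting, to launch the computations for several sources concurrently, as the abstract describes.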

Tags: Algorithms, Computer science, CUDA, Graph theory, nVidia, nVidia GeForce GTX 480

May 25, 2013 by hgpu
