high performance computing on graphics processing units: hgpu.org

hgpu.org » Programming » Algorithms » Parallel Breadth First Search on GPU Clusters

Parallel Breadth First Search on GPU Clusters

Zhisong Fu, Harish Kumar Dasari, Martin Berzins, Bryan Thompson

SYSTAP, LLC. 4501 Tower Road, Greensboro, NC 27410, USA

Scientific Computing and Imaging Institute, University of Utah, UUSCI-2014-002, 2014

@article{fu2014parallel,

title={Parallel Breadth First Search on GPU Clusters},

author={Fu, Zhisong and Dasari, Harish Kumar and Berzins, Martin and Thompson, Bryan},

year={2014}

}

Download (PDF)

View

Source

Source codes

Package:

mapgraph

2661

views

Fast, scalable, low-cost, and low-power execution of parallel graph algorithms is important for a wide variety of commercial and public sector applications. Breadth First Search (BFS) imposes an extreme burden on memory bandwidth and network communications and has been proposed as a benchmark that may be used to evaluate current and future parallel computers. Hardware trends and manufacturing limits strongly imply that many core devices, such as NVIDIA GPUs and the Intel Xeon Phi, will become central components of such future systems. GPUs are well known to deliver the highest FLOPS/watt and enjoy a very significant memory bandwidth advantage over CPU architectures. Recent work has demonstrated that GPUs can deliver high performance for parallel graph algorithms and, further, that it is possible to encapsulate that capability in a manner that hides the low level details of the GPU architecture and the CUDA language but preserves the high throughput of the GPU. We extend previous research on GPUs and on scalable graph processing on super-computers and demonstrate that a high-performance parallel graph machine can be created using commodity GPUs and networking hardware.

Tags: Algorithms, Computer science, CUDA, GPU cluster, Graph theory, Network communications, nVidia, Package, Tesla K20

August 11, 2014 by hgpu

Rating: 2.5/5. From 1 vote.

Please wait...

Your response

You must be logged in to post a comment.

* * *

high performance computing on graphics processing units: hgpu.org

Parallel Breadth First Search on GPU Clusters

Package:

Your response

Recent source codes

UniCoder: Unified Visual-to-Code Generation via Symbolic Rewards and Reference-Guided Code Optimization

CuFuzz: An API-Knowledge-Graph Coverage-Driven Fuzzing Framework for CUDA Libraries

AutoPass: Evidence-Guided LLM Agents for Compiler Performance Tuning

Probe-and-Refine Tuning of Repository Guidance for AI Coding Agents

CUDAnalyst (CUDA + Analyst)

CodegenBench

KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

CUDA Kernel Fusion Benchmarks

IntelliKit: Agent-first tooling for AMD hardware

DITRON: Distributed Compiler based on Triton for Parallel Systems

Most viewed papers (last 30 days)

Parallel Breadth First Search on GPU Clusters

Package:

Share this:

Your response

Recent source codes

Most viewed papers (last 30 days)