## Using OpenCL for Implementing Simple Parallel Graph Algorithms

Department of Computer Science, University of Auckland, Auckland, New Zealand

The 2011 World Congress in Computer Science, Computer Engineering, and Applied Computing (WORLDCOMP’11), 2011

@article{dinneen2011using,

title={Using OpenCL for Implementing Simple Parallel Graph Algorithms},

author={Dinneen, M.J. and Khosravani, M. and Probert, A.},

year={2011}

}

For the typical graph algorithms encountered most frequently in practice (such as those introduced in typical entry-level algorithms courses: graph searching/traversals, shortest paths problems, strongly connected components and minimum spanning trees) we want to consider practical non-sequential platforms such as the emergence of cost effective General-Purpose computation on Graphics Processing Units (GPGPU). In this paper we provide two simple design techniques that allow a nonspecialist computer scientist to harness the power of their GPUs as parallel compute devices. These two natural ideas are (a) using a host CPU script to synchronize a distributed view of a graph algorithm where each node of the input graph is associated with a unique processing thread ID and (b) using GPU atomic operations to synchronize a single kernel launch where a set of threads, upper-bounded by at most the number of streaming processing units available, continuously stay active and time-slice the total workload until the algorithm completes. We give concrete comparative implementations of both of these approaches for the simple problem of exploring a graph using breadth-first search. Finally we conclude that OpenCL, in addition to CUDA, is a natural tool for modern graph algorithm designers, especially those who are not experts of GPU hardware architecture, to develop real-world usable graph applications.

October 24, 2011 by hgpu