Data-driven versus Topology-driven Irregular Computations on GPUs


Rupesh Nasre, Martin Burtscher, Keshav Pingali

The University of Texas, Austin, Texas, USA

27th IEEE International Parallel & Distributed Processing Symposium, 2013

@inproceedings{nasre2013data,
  title={Data-driven versus Topology-driven Irregular Computations on GPUs},
  author={Nasre, Rupesh and Burtscher, Martin and Pingali, Keshav},
  booktitle={27th IEEE International Parallel \& Distributed Processing Symposium},
  year={2013}
}


Irregular algorithms are algorithms with complex main data structures such as directed and undirected graphs, trees, etc. A useful abstraction for many irregular algorithms is the operator formulation, in which the algorithm is viewed as the iterated application of an operator to certain nodes, called active nodes, in the graph. Each operator application, called an activity, usually touches only a small part of the overall graph, so non-overlapping activities can be performed in parallel. In topology-driven implementations, all nodes are assumed to be active, so the operator is applied everywhere in the graph even if there is no work to do at some nodes. In contrast, in data-driven implementations the operator is applied only to nodes at which there might be work to do. Multicore implementations of irregular algorithms are usually data-driven because current multicores support only small numbers of threads and work-efficiency is important. Conversely, many irregular GPU implementations use a topology-driven approach because work inefficiency can be counterbalanced by the large number of GPU threads. In this paper, we study data-driven and topology-driven implementations of six important graph algorithms on GPUs. Our goal is to understand the tradeoffs between these implementations and how to optimize them. We find that data-driven versions are generally faster and scale better despite the cost of maintaining a worklist. However, topology-driven versions can be superior when certain algorithmic properties are exploited to optimize the implementation. These results led us to devise hybrid approaches that combine the two techniques and outperform both of them.
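The distinction between the two styles can be illustrated with a minimal sequential sketch. It is not the authors' GPU code: it assumes single-source shortest paths as the example algorithm, uses an edge-relaxation operator, and simulates on the CPU what a GPU would run as one thread per node. The function names, the `GRAPH` chain, and the application counter are all hypothetical and exist only for illustration.

```python
from collections import deque

INF = float("inf")


def relax(graph, dist, u):
    """Operator (assumed for illustration): push u's distance along its
    outgoing edges and return the neighbors whose distance improved."""
    improved = []
    if dist[u] == INF:
        return improved  # node has no useful work to do yet
    for v, weight in graph[u]:
        if dist[u] + weight < dist[v]:
            dist[v] = dist[u] + weight
            improved.append(v)
    return improved


def sssp_topology_driven(graph, source):
    """Apply the operator to every node in each sweep until nothing changes."""
    dist = {u: INF for u in graph}
    dist[source] = 0
    applications = 0
    progress = True
    while progress:
        progress = False
        for u in graph:  # on a GPU: one thread per node
            applications += 1
            if relax(graph, dist, u):
                progress = True
    return dist, applications


def sssp_data_driven(graph, source):
    """Apply the operator only to nodes taken from a worklist."""
    dist = {u: INF for u in graph}
    dist[source] = 0
    applications = 0
    worklist = deque([source])
    while worklist:
        u = worklist.popleft()
        applications += 1
        # Only nodes whose distance changed become active again.
        worklist.extend(relax(graph, dist, u))
    return dist, applications


# Hypothetical 4-node chain: node -> list of (neighbor, edge weight).
GRAPH = {0: [(1, 1)], 1: [(2, 1)], 2: [(3, 1)], 3: []}

if __name__ == "__main__":
    print(sssp_topology_driven(GRAPH, 0))
    print(sssp_data_driven(GRAPH, 0))
```

Both functions compute the same distances. The topology-driven version also visits nodes that have nothing to do, including a final sweep that only confirms convergence, while the data-driven version pays for maintaining the worklist instead. This is the tradeoff the abstract describes.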

Tags: Algorithms, Computer science, CUDA, nVidia, nVidia Quadro FX 6000

June 4, 2013 by hgpu
