Understanding the SIMD Efficiency of Graph Traversal on GPU

Yichao Cheng, Hong An, Zhitao Chen, Feng Li, Zhaohui Wang, Xia Jiang, Yi Peng
University of Science and Technology of China, Hefei, China
The 14th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2014), 2014


   title={Understanding the SIMD Efficiency of Graph Traversal on GPU},

   author={Cheng, Yichao and An, Hong and Chen, Zhitao and Li, Feng and Wang, Zhaohui and Jiang, Xia and Peng, Yi},



Download Download (PDF)   View View   Source Source   



Graph is a widely used data structure and graph algorithms, such as breadth-first search (BFS), are regarded as key components in a great number of applications. Recent studies have attempted to accelerate graph algorithms on highly parallel graphics processing unit (GPU). Although many graph algorithms based on large graphs exhibit abundant parallelism, their performance on GPU still faces formidable challenges, one of which is to map the irregular computation onto GPU’s vectorized execution model. In this paper, we investigate the link between graph topology and performance of BFS on GPU. We introduce a novel model to analyze the components of SIMD underutilization. We show that SIMD lanes are wasted either due to the workload imbalance between tasks, or to the heterogeneity of each task. We also develop corresponding metrics to quantify the SIMD efficiency for BFS on GPU. Finally, we demonstrate the applicability of the metrics by using them to profile the performance for different mapping strategies.
Rating: 1.8/5. From 3 votes.
Please wait...

* * *

* * *

HGPU group © 2010-2021 hgpu.org

All rights belong to the respective authors

Contact us: