Designing scalable many-core parallel algorithms for min graphs using CUDA
Lamar University, USA
IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010
@conference{tran2010designing,
title={Designing scalable many-core parallel algorithms for min graphs using CUDA},
author={Tran, Q.N.},
booktitle={Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on},
pages={1–8},
year={2010},
organization={IEEE}
}
Removing redundant edges on a large graph is a fundamental problem in many practical applications such as verification of real-time systems and network routing. In this paper, we present the designs of scalable and efficient parallel algorithms for multiple many-core GPU devices using CUDA. Our algorithms expose substantial fine-grained parallelism while maintaining minimal global communication. By using the global scope of the GPU’s global memory, coalescing the global memory reads and writes, and avoiding on-chip shared memory bank conflicts, we are able to achieve a large performance benefit with a speed-up of 2,500x on a desktop computer in comparison with a single core CPU program. We report our experiments on large graphs with up to 29 K vertices using multiple GPU devices.
March 7, 2011 by hgpu