Parallel Distributed Breadth First Search on the Kepler Architecture
Istituto per le Applicazioni del Calcolo, IAC-CNR, Rome, Italy
arXiv:1408.1605 [cs.DC] (7 Aug 2014)
@article{2014arXiv1408.1605P,
author={Bisson, Mauro and Bernaschi, Massimo and Mastrostefano, Enrico},
title={Parallel Distributed Breadth First Search on the Kepler Architecture},
journal={ArXiv e-prints},
archivePrefix={arXiv},
eprint={1408.1605},
primaryClass={cs.DC},
keywords={Distributed, Parallel, and Cluster Computing},
year={2014},
month={aug}
}
We present the results obtained with an evolution of our CUDA-based solution for exploring large graphs via Breadth First Search. This latest version takes full advantage of the features of the Kepler architecture and relies on a 2D decomposition of the adjacency matrix to reduce the number of communications among the GPUs. The final result is a code that can visit 400 billion edges per second on a cluster equipped with 4096 Tesla K20X GPUs.
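To illustrate what a 2D decomposition of the adjacency matrix means for a level-synchronous BFS, here is a minimal single-process sketch in Python. It is not the authors' CUDA code: the function names (`block_of`, `partition_2d`, `bfs_2d`), the contiguous block layout, and the simulated rows x cols grid are all assumptions made for illustration. Each edge (u, v) is stored in the grid cell given by the row block of u and the column block of v. In a real distributed run, this layout generally lets each process exchange frontier data only with the processes in its own grid row and column, rather than with every other process.

```python
from collections import defaultdict


def block_of(vertex, n_vertices, n_blocks):
    """Index of the contiguous block that holds `vertex` (hypothetical layout)."""
    block_size = -(-n_vertices // n_blocks)  # ceiling division
    return vertex // block_size


def partition_2d(edges, n_vertices, rows, cols):
    """Assign each directed edge (u, v) to cell (row block of u, column block of v)."""
    blocks = defaultdict(list)
    for u, v in edges:
        cell = (block_of(u, n_vertices, rows), block_of(v, n_vertices, cols))
        blocks[cell].append((u, v))
    return blocks


def bfs_2d(edges, n_vertices, rows, cols, source):
    """Level-synchronous BFS over the 2D-partitioned edge list.

    The loop over grid cells stands in for work that separate processes
    (or GPUs) would perform in parallel; here it runs sequentially.
    Returns the BFS level of every vertex, or -1 if unreachable.
    """
    blocks = partition_2d(edges, n_vertices, rows, cols)
    level = [-1] * n_vertices
    level[source] = 0
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        next_frontier = set()
        # Each cell scans only its local edges against the current frontier.
        for local_edges in blocks.values():
            for u, v in local_edges:
                if u in frontier and level[v] == -1:
                    next_frontier.add(v)
        # Merge step: newly discovered vertices receive the current depth.
        for v in next_frontier:
            level[v] = depth
        frontier = next_frontier
    return level
```

For example, an undirected graph can be passed as a list containing both directions of every edge, as in `bfs_2d(edges, 5, 2, 2, 0)` for a 5-vertex graph on a simulated 2 x 2 grid.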
August 9, 2014 by hgpu