Relational joins on graphics processors
Hong Kong University of Science and Technology, China
In SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data (2008), pp. 511-524.
@conference{he2008relational,
title={Relational joins on graphics processors},
author={He, B. and Yang, K. and Fang, R. and Lu, M. and Govindaraju, N. and Luo, Q. and Sander, P.},
booktitle={Proceedings of the 2008 ACM SIGMOD international conference on Management of data},
pages={511--524},
year={2008},
organization={ACM}
}
We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). The most recent GPU features include support for writing to random memory locations, efficient inter-processor communication, and a programming model for general-purpose computing. Taking advantage of these new features, we design a set of data-parallel primitives such as split and sort, and use these primitives to implement indexed or non-indexed nested-loop, sort-merge and hash joins. Our algorithms utilize the high parallelism as well as the high memory bandwidth of the GPU, and use parallel computation and memory optimizations to effectively reduce memory stalls. We have implemented our algorithms on a PC with an NVIDIA G80 GPU and an Intel quad-core CPU. Our GPU-based join algorithms are able to achieve a performance improvement of 2-7X over their optimized CPU-based counterparts.
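To make the idea of building joins from data-parallel primitives concrete, here is a minimal sequential sketch in Python. It is not the authors' GPU implementation. It assumes a "split" primitive can be expressed as three passes (histogram, exclusive prefix sum, scatter) and that a hash join can be layered on top by splitting both relations with the same partition function. The names `split`, `partitioned_hash_join`, `part_of`, and `num_parts` are hypothetical and chosen for illustration only; on a GPU each pass would run across many threads rather than in a loop.

```python
from itertools import accumulate


def split(rows, num_parts, part_of):
    """Reorder rows so each partition is contiguous.

    Illustrative three-pass scheme (an assumption, not the paper's code):
    histogram, exclusive prefix sum, scatter.
    """
    # Pass 1: count how many rows fall into each partition.
    counts = [0] * num_parts
    for row in rows:
        counts[part_of(row)] += 1

    # Pass 2: an exclusive prefix sum gives each partition's start offset.
    starts = [0] + list(accumulate(counts))[:-1]

    # Pass 3: scatter every row to the next free slot of its partition.
    out = [None] * len(rows)
    cursor = list(starts)
    for row in rows:
        p = part_of(row)
        out[cursor[p]] = row
        cursor[p] += 1
    return out, starts, counts


def partitioned_hash_join(r, s, num_parts=4):
    """Equi-join two lists of (key, payload) tuples on key."""
    def part_of(row):
        return hash(row[0]) % num_parts

    # Split both inputs with the same function so matching keys
    # always land in the same partition number.
    r_out, r_starts, r_counts = split(r, num_parts, part_of)
    s_out, s_starts, s_counts = split(s, num_parts, part_of)

    result = []
    for p in range(num_parts):
        # Build a small hash table on this partition of r ...
        table = {}
        for key, payload in r_out[r_starts[p]:r_starts[p] + r_counts[p]]:
            table.setdefault(key, []).append(payload)
        # ... then probe it with the matching partition of s.
        for key, payload in s_out[s_starts[p]:s_starts[p] + s_counts[p]]:
            for r_payload in table.get(key, []):
                result.append((key, r_payload, payload))
    return result


if __name__ == "__main__":
    r = [(1, "a"), (2, "b"), (2, "c"), (5, "d")]
    s = [(2, "x"), (5, "y"), (7, "z")]
    print(sorted(partitioned_hash_join(r, s)))
```

Because the scatter targets are computed from the prefix sum before any row is written, each row has a known output slot. That property is what makes this style of primitive a plausible fit for hardware that supports writes to random memory locations.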
November 4, 2010 by hgpu