Fast scan algorithms on graphics processors

Yuri Dotsenko, Naga K. Govindaraju, Peter P. Sloan, Charles Boyd, John Manferdelli
Microsoft Corporation, One Microsoft Way, Redmond, WA 98052, USA
In ICS ’08: Proceedings of the 22nd annual international conference on Supercomputing (2008), pp. 205-213


   title={Fast scan algorithms on graphics processors},

   author={Dotsenko, Y. and Govindaraju, N.K. and Sloan, P.P. and Boyd, C. and Manferdelli, J.},

   booktitle={Proceedings of the 22nd annual international conference on Supercomputing},





Download Download (PDF)   View View   Source Source   



Scan and segmented scan are important data-parallel primitives for a wide range of applications. We present fast, work-efficient algorithms for these primitives on graphics processing units (GPUs). We use novel data representations that map well to the GPU architecture. Our algorithms exploit shared memory to improve memory performance. We further improve the performance of our algorithms by eliminating shared-memory bank conflicts and reducing the overheads in prior shared-memory GPU algorithms. Furthermore, our algorithms are designed to work well on general data sets, including segmented arrays with arbitrary segment lengths. We also present optimizations to improve the performance of segmented scans based on the segment lengths. We implemented our algorithms on a PC with an NVIDIA GeForce 8800 GPU and compared our results with prior GPU-based algorithms. Our results indicate up to 10x higher performance over prior algorithms on input sequences with millions of elements.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2021 hgpu.org

All rights belong to the respective authors

Contact us: