Parallel Prefix Scan with Compute Unified Device Architecture (CUDA)

B. Muni Lavanya
JNTUA College of Engineering, Pulivendula, A.P., India
9th IRF International Conference, 2014

Parallel prefix scan, also known as parallel prefix sum, is a building block for many parallel algorithms, including polynomial evaluation, sorting, and the construction of data structures. This paper introduces prefix scan and describes a step-by-step procedure for implementing it efficiently with Compute Unified Device Architecture (CUDA). It starts with a basic naive algorithm and proceeds through more advanced techniques to obtain the best performance.
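For readers unfamiliar with the operation: an inclusive prefix sum maps an input x[0..n-1] to an output y where y[i] = x[0] + x[1] + ... + x[i]. The sketch below illustrates what a naive starting point might look like. It is not taken from the paper; it assumes the common double-buffered, shared-memory formulation (often attributed to Hillis and Steele), a single thread block, and at most 1024 elements. The names `naive_scan_kernel` and `inclusive_scan_gpu` are hypothetical, and error handling is kept minimal.

```cuda
#include <cuda_runtime.h>
#include <stdexcept>
#include <vector>

// Naive inclusive scan for one thread block (assumed formulation, not the
// paper's code). Shared memory holds two buffers of blockDim.x ints each;
// every step reads from one buffer and writes to the other.
__global__ void naive_scan_kernel(const int* in, int* out, int n) {
    extern __shared__ int temp[];
    const int tid = threadIdx.x;
    const int width = blockDim.x;
    int cur = 0;  // index (0 or 1) of the buffer holding the current values

    temp[tid] = (tid < n) ? in[tid] : 0;
    __syncthreads();

    // After the step with a given offset, element tid holds the sum of the
    // (up to) 2 * offset inputs ending at position tid.
    for (int offset = 1; offset < width; offset <<= 1) {
        const int nxt = 1 - cur;
        int v = temp[cur * width + tid];
        if (tid >= offset) {
            v += temp[cur * width + tid - offset];
        }
        temp[nxt * width + tid] = v;
        cur = nxt;
        __syncthreads();  // all writes finish before the next step reads
    }

    if (tid < n) {
        out[tid] = temp[cur * width + tid];
    }
}

// Hypothetical host wrapper: copies data to the device, runs the kernel in
// a single block, and copies the result back.
std::vector<int> inclusive_scan_gpu(const std::vector<int>& h_in) {
    const int n = static_cast<int>(h_in.size());
    if (n == 0) return {};
    if (n > 1024) {
        throw std::invalid_argument("this sketch handles at most 1024 elements");
    }

    const size_t bytes = static_cast<size_t>(n) * sizeof(int);
    int* d_in = nullptr;
    int* d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) {
        throw std::runtime_error("cudaMalloc failed");
    }
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) {
        cudaFree(d_in);
        throw std::runtime_error("cudaMalloc failed");
    }
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    // One thread per element; two shared buffers of n ints.
    naive_scan_kernel<<<1, n, 2 * bytes>>>(d_in, d_out, n);

    std::vector<int> h_out(n);
    const cudaError_t err =
        cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        throw std::runtime_error(cudaGetErrorString(err));
    }
    return h_out;
}
```

Calling `inclusive_scan_gpu({3, 1, 7, 0, 4, 1, 6, 3})` should yield `{3, 4, 11, 11, 15, 16, 22, 25}`. This formulation performs on the order of n log n additions, whereas a sequential scan needs only n - 1, which is one reason more advanced, work-efficient variants are usually preferred for large inputs.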