Parallel Prefix Scan with Compute Unified Device Architecture (CUDA)
JNTUA College of Engineering, Pulivendula, A.P., India
9th IRF International Conference, 2014
@article{lavanya2014parallel,
title={Parallel Prefix Scan with Compute Unified Device Architecture (CUDA)},
author={Lavanya, B. Muni},
year={2014}
}
Parallel prefix scan, also known as parallel prefix sum, is a building block for many parallel algorithms, including polynomial evaluation, sorting, and building data structures. This paper introduces prefix scan and describes a step-by-step procedure for implementing it efficiently with Compute Unified Device Architecture (CUDA). It starts with a basic naive algorithm and proceeds through more advanced techniques to obtain the best performance.
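To make the operation concrete, here is a minimal C++ sketch of what a prefix scan computes. It is not taken from the paper. The function names `inclusive_scan_reference` and `naive_scan_simulated` are hypothetical. The second function assumes that the "basic naive algorithm" resembles the well-known step-doubling (Hillis-Steele style) scheme, and it simulates that scheme on the CPU: each iteration of the inner loop stands in for the work one CUDA thread would do in a step.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sequential reference: out[i] = in[0] + in[1] + ... + in[i] (inclusive scan).
std::vector<int> inclusive_scan_reference(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        running += in[i];
        out[i] = running;
    }
    return out;
}

// CPU simulation of a step-doubling scan (assumed form of the naive algorithm).
// In each step, element i adds the value that sits `offset` positions to its
// left; the offset doubles every step, so about log2(n) steps are needed.
std::vector<int> naive_scan_simulated(const std::vector<int>& in) {
    std::vector<int> src = in;
    std::vector<int> dst(in.size());
    for (std::size_t offset = 1; offset < in.size(); offset *= 2) {
        // On a GPU, every index i in this loop would be handled by its own thread.
        for (std::size_t i = 0; i < in.size(); ++i) {
            dst[i] = (i >= offset) ? src[i] + src[i - offset] : src[i];
        }
        // Double buffering: this step's output becomes the next step's input,
        // so no element is overwritten before it has been read.
        std::swap(src, dst);
    }
    return src;
}
```

For the input `{3, 1, 7, 0, 4, 1, 6, 3}`, both functions return `{3, 4, 11, 11, 15, 16, 22, 25}`. The step-doubling form performs O(n log n) additions against O(n) for the sequential loop, which is one common reason for moving on to more work-efficient variants.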
June 11, 2014 by hgpu