
Automatic Scan Parallelization in OpenMP

Maicol Zegarra, Marcio Pereira, Xavier Martorell, Guido Araujo
Institute of Computing, UNICAMP, Campinas, Brazil
6th Workshop on Parallel Programming Models (MPP), 2017

@article{zegarra2017automatic,
   title={Automatic Scan Parallelization in OpenMP},
   author={Zegarra, Maicol and Pereira, Marcio and Martorell, Xavier and Araujo, Guido},
   year={2017}
}


Prefix scan (or simply scan) is an operator that computes all the partial sums of a vector. A scan operation produces a vector in which each element is the sum of the preceding elements of the original vector up to the corresponding position. Scan is a key operation in many relevant problems, such as sorting, lexical analysis, string comparison, and image filtering, among others. Although there are libraries that provide hand-parallelized implementations of scan in CUDA and OpenCL, no automatic parallelization solution exists for this operator in OpenMP. This paper proposes a new OpenMP clause that enables the automatic synthesis of parallel scan. By using the proposed clause, programmers can considerably reduce the complexity of designing scan-based algorithms, which lets them focus on the problem rather than on learning new parallel programming models or languages. The scan clause was designed in AClang (www.aclang.org), an open-source LLVM/Clang compiler framework that implements the recently released OpenMP 4.X Accelerator Programming Model. AClang automatically converts OpenMP 4.X annotated program regions to OpenCL. Experiments running a set of typical scan-based algorithms on NVIDIA, Intel, and ARM GPUs show that the performance of the proposed OpenMP clause is equivalent to that achieved with OpenCL library calls, with the advantage of lower programming complexity.
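To make the operator concrete, the sketch below shows a sequential scan in C alongside one common hand-written way to parallelize it with standard OpenMP loops. It is only an illustration: it assumes the inclusive form of scan (each output includes the element at its own position), it does not use the clause proposed in the paper, and it is not the code AClang generates. The names `scan_seq`, `scan_blocked`, `block`, and `sums` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential inclusive scan: out[i] = in[0] + ... + in[i]. */
void scan_seq(const int *in, int *out, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += in[i];
        out[i] = acc;
    }
}

/* Blocked three-phase scan (illustrative, not the paper's method).
 * `sums` is caller-provided scratch with one slot per block,
 * i.e. at least (n + block - 1) / block elements. */
void scan_blocked(const int *in, int *out, size_t n, size_t block, int *sums) {
    assert(block > 0);
    long nblocks = (long)((n + block - 1) / block);

    /* Phase 1: scan each block independently and record its total. */
    #pragma omp parallel for
    for (long b = 0; b < nblocks; b++) {
        size_t lo = (size_t)b * block;
        size_t hi = (lo + block < n) ? lo + block : n;
        int acc = 0;
        for (size_t i = lo; i < hi; i++) {
            acc += in[i];
            out[i] = acc;
        }
        sums[b] = acc;
    }

    /* Phase 2: exclusive scan of the block totals, done sequentially
     * because the number of blocks is small compared with n. */
    int running = 0;
    for (long b = 0; b < nblocks; b++) {
        int total = sums[b];
        sums[b] = running;
        running += total;
    }

    /* Phase 3: add each block's starting offset to its elements. */
    #pragma omp parallel for
    for (long b = 0; b < nblocks; b++) {
        size_t lo = (size_t)b * block;
        size_t hi = (lo + block < n) ? lo + block : n;
        for (size_t i = lo; i < hi; i++) {
            out[i] += sums[b];
        }
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and `scan_blocked` simply runs its three phases in order, which makes it easy to check against `scan_seq`. The amount of bookkeeping in the blocked version, compared with the four-line sequential loop, gives a sense of the programming effort that an automatically synthesized scan is meant to remove.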
