NAS Parallel Benchmarks for GPGPUs using a Directive-based Programming Model

Rengan Xu, Xiaonan Tian, Sunita Chandrasekaran, Yonghong Yan, Barbara Chapman
Department of Computer Science, University of Houston, Houston TX, 77004 USA
27th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2014), 2014

@inproceedings{xu2014parallel,
   title={NAS Parallel Benchmarks for GPGPUs using a Directive-based Programming Model},
   author={Xu, Rengan and Tian, Xiaonan and Chandrasekaran, Sunita and Yan, Yonghong and Chapman, Barbara},
   booktitle={27th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2014)},
   year={2014}
}

The broad adoption of accelerators has boosted interest in accelerator programming. Accelerators such as GPGPUs are optimized for throughput and offer high GFLOPS and memory bandwidth. CUDA has been adopted rapidly, but it is proprietary and applicable only to NVIDIA GPUs, and the difficulty of writing efficient CUDA code has motivated higher-level programming approaches such as OpenACC. Directive-based programming models such as OpenMP and OpenACC let programmers rapidly prototype applications by adding annotations that guide compiler optimizations. In this paper we study the effectiveness of a high-level directive-based programming model, OpenACC, for parallelizing the NAS Parallel Benchmarks (NPB) on GPGPUs. We apply techniques such as array privatization, memory coalescing, and cache optimization, and examine their impact on benchmark performance. The right choice or combination of techniques and hints is crucial for compilers to generate highly efficient code tuned to a particular type of accelerator; a poorly chosen combination can degrade performance. We also propose a new clause, "scan", that handles scan operations for arbitrary input array sizes. We hope that the practices discussed in this paper will help users migrate their sequential or CPU-parallel codes to GPGPU architectures and achieve high performance.
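
To make the directive-based approach concrete, the following is a minimal OpenACC sketch in C. It is not code from the paper or from NPB; the routine name, array names, and sizes are assumed for illustration. It shows two of the techniques named in the abstract: the private clause gives each gang its own copy of a scratch array (array privatization), and the inner vector loops iterate over the contiguous index so that adjacent lanes access adjacent memory locations (memory coalescing).

#include <stdio.h>

#define N 1024
#define M 1024

/* Hypothetical example (not from the paper or NPB): each row is processed
 * with a per-row scratch buffer 'tmp'.  The private clause gives every
 * gang its own copy of tmp (array privatization), and the inner vector
 * loops run over the contiguous index j, so neighbouring vector lanes
 * touch neighbouring memory locations (memory coalescing). */
void smooth(double out[N][M], double in[N][M])
{
    double tmp[M];                            /* scratch buffer, reused per row */

    #pragma acc parallel loop gang private(tmp) \
                copyout(out[0:N][0:M]) copyin(in[0:N][0:M])
    for (int i = 0; i < N; ++i) {
        #pragma acc loop vector
        for (int j = 0; j < M; ++j)
            tmp[j] = 2.0 * in[i][j];          /* fill the private scratch array */

        #pragma acc loop vector
        for (int j = 0; j < M; ++j)
            out[i][j] = tmp[j];               /* unit-stride (coalesced) writes */
    }
}

int main(void)
{
    static double in[N][M], out[N][M];

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < M; ++j)
            in[i][j] = (double)(i + j);

    smooth(out, in);
    printf("out[1][1] = %f\n", out[1][1]);
    return 0;
}

Without the private clause, all gangs would share a single tmp buffer and race on it while processing different rows concurrently; privatizing it trades a small amount of extra memory per gang for correctness. The same pattern arises frequently when porting CPU-parallel loops that reuse per-iteration scratch storage, which is one of the situations the paper's array-privatization discussion addresses.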