Portable Mapping of Data Parallel Programs to OpenCL for Heterogeneous Systems

Dominik Grewe, Zheng Wang, Michael F.P. O’Boyle
School of Informatics, University of Edinburgh
International Symposium on Code Generation and Optimization (CGO), 2013
@article{grewe2013portable,
   title={Portable Mapping of Data Parallel Programs to OpenCL for Heterogeneous Systems},
   author={Grewe, D. and Wang, Z. and O’Boyle, M.F.P.},
   year={2013}
}


General-purpose GPU-based systems are highly attractive as they offer potentially massive performance at little cost. Realizing this potential is challenging due to the complexity of programming them. This paper presents a compiler-based approach to automatically generate optimized OpenCL code from data-parallel OpenMP programs for GPUs. Such an approach brings together the benefits of a clear high-level language (OpenMP) and an emerging standard (OpenCL) for heterogeneous multi-cores. A key feature of our scheme is that it leverages existing transformations, especially data transformations, to improve performance on GPU architectures, and uses predictive modeling to automatically determine whether it is worthwhile to run the OpenCL code on the GPU or the OpenMP code on the multi-core host. We applied our approach to the entire NAS parallel benchmark suite and evaluated it on two distinct GPU-based systems: Core i7/NVIDIA GeForce GTX 580 and Core i7/AMD Radeon 7970. We achieved average speedups of 4.51x and 4.20x (up to 143x and 67x), respectively, over a sequential baseline. This is, on average, 1.63 and 1.56 times faster than a hand-coded, GPU-specific OpenCL implementation developed by independent expert programmers.
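To make the OpenMP-to-OpenCL mapping in the abstract concrete, here is a minimal hand-written sketch of a data-parallel OpenMP loop next to an OpenCL kernel that expresses the same computation. It is not output of the authors' compiler and does not include any of their data transformations or the predictive model; the names `saxpy_omp` and `saxpy_opencl_src` are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Data-parallel loop in OpenMP (hypothetical example, not from the paper).
   Every iteration is independent, so the iterations can be divided among
   CPU threads. If the compiler is run without OpenMP support, the pragma
   is ignored and the loop simply runs serially. */
void saxpy_omp(size_t n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* The same computation written by hand as an OpenCL C kernel, kept here
   as source text. Each loop iteration becomes one work-item, and the loop
   index is replaced by the work-item's global id. The bounds check covers
   the case where the global work size is rounded up past n. */
static const char *saxpy_opencl_src =
    "__kernel void saxpy(const unsigned int n, const float a,\n"
    "                    __global const float *x, __global float *y)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    if (i < n) {\n"
    "        y[i] = a * x[i] + y[i];\n"
    "    }\n"
    "}\n";
```

The kernel string would still need host code (context creation, buffer transfers, kernel launch) before it could run on a GPU. That host-side cost is one reason running on the GPU is not always worthwhile, which is the decision the abstract assigns to predictive modeling.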

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: Node 1 has two AMD GPUs, and Node 2 has one AMD and one nVidia GPU. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.
