Harnessing the Power of GPUs without Losing Abstractions in SaC and ArrayOL: A Comparative Study

Jing Guo, Wendell Rodrigues, Jeyarajan Thiyagalingam, Frederic Guyomarch, Pierre Boulet, Sven-Bodo Scholz

Department of Computer Science, University of Hertfordshire

16th International Workshop on High-Level Parallel Programming Models and Supportive Environments, Anchorage (Alaska), HIPS 2011, 2011

Over recent years, using Graphics Processing Units (GPUs) has become an effective method for increasing the performance of many applications. However, these performance benefits come at a price. Firstly, extensive programming expertise and intimate knowledge of the underlying hardware are essential for gaining good speedups. Secondly, the expressibility of GPU-based programs is not powerful enough to retain the high-level abstractions of the solutions. Although the programming experience has been significantly improved by existing frameworks like CUDA and OpenCL, it is still a challenge to utilise these devices effectively while retaining the programming abstractions. To this end, performing a source-to-source transformation, whereby a high-level language is mapped to CUDA or OpenCL, is an attractive option. In particular, it enables one to retain high-level abstractions and to harness the power of GPUs without any expertise in GPGPU programming. In this paper, we compare and analyse two such schemes. One of them is a transformation mechanism for mapping an image/signal processing domain-specific language, ArrayOL, to OpenCL. The other is a transformation route for mapping a high-level general-purpose array processing language, Single Assignment C (SaC), to CUDA. Using a real-world image processing application as a running example, we demonstrate that, despite being general purpose, the array processing language can be used to specify complex array access patterns generically. Performance of the generated CUDA code is comparable to that of the OpenCL code created from the domain-specific language.

Tags: Computer science, CUDA, nVidia, nVidia GeForce GTX 480, OpenCL, Performance, Video encoding

September 25, 2011 by hgpu
