Accelerating Haskell Array Codes with Algorithmic Skeletons on GPUs

hgpu.org » Programming » Algorithms » Accelerating Haskell Array Codes with Algorithmic Skeletons on GPUs

Accelerating Haskell Array Codes with Algorithmic Skeletons on GPUs

Sean Youngsung Lee

School of Computer Science and Engineering, The University of New South Wales

The University of New South Wales, 2011

@phdthesis{lee2011accelerating,

title={Accelerating Haskell Array Codes with Algorithmic Skeletons on GPUs},

author={Lee, S.Y.},

year={2011},

school={The University of New South Wales}

}

Download (PDF)

View

Source

2058

views

GPUs have been gaining popularity as general purpose parallel processors that deliver a performance to cost ratio superior to that of CPUs. However, programming on GPUs has remained a specialised area, as it often requires significant knowledge about the GPU architecture and platform-specific parallelisation of the algorithms that are implemented. Furthermore, the dominant programming models on GPUs limit functional decomposition of programs, as they require programmers to write separate functions to run on GPUs. I present and quantitatively evaluate a GPU programming system that provides a high-level abstraction to facilitate the use of GPUs for general purpose array processing. The presented programming system liberates programmers of low-level details and enables functional decomposition of programs, whilst providing the facilities to sufficiently exploit the parallel processing capability of GPUs. Fundamentally, the presented programming system allows programmers to focus on what to program on GPUs instead of how to program GPUs. The approach is based on algorithmic skeletons mapped to higher-order functions, which are commonly available in functional programming languages. The presented programming system (1) encapsulates the low-level control of GPUs and the GPU-specific parallelisation of the higher-order functions in algorithmic skeletons, (2) employs an embedded domain specific language as the interface that treats the parallel operations as expressions with parallel semantics, allowing functional decomposition of programs, and (3) manages the compilation of the embedded domain specific language into algorithmic skeleton instances and the execution of the algorithmic skeleton instances online. Programmers only need to write the operations to be executed on GPUs as expressions in the embedded domain specific language, and the presented programming system takes care of the rest. Although the concrete implementation presented in the thesis involves an embedded domain specific language in Haskell, Accelerate, and a specific GPU platform, CUDA, the approach in this thesis can be applied to other similar configurations with appropriate adjustments.

Tags: Algorithms, Computer science, CUDA, nVidia, Tesla S1070, Thesis

December 18, 2012 by hgpu

No votes yet.

Please wait...

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

high performance computing on graphics processing units: hgpu.org