Automatic program analysis for data parallel kernels

hgpu.org » Applications » Computer science » Automatic program analysis for data parallel kernels

Automatic program analysis for data parallel kernels

Calin Juravle

Department of Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands

Utrecht University, Utrecht, The Netherlands, 2011

@phdthesis{juravle2011automatic,

title={Automatic program analysis for data parallel kernels},

author={Juravle, C.},

year={2011},

school={Utrecht University}

}

Download (PDF)

View

Source

1357

views

It is widely known that GPUs have more computational power and expose a far greater level of parallelism than conventional CPUs. Despite their high potential, GPUs are not yet a popular choice in practice, mainly because of their high programming complexity. The complexity derives from two factors. First, the existing programming models are tied to the underlying GPU architectures. Because of this, GPU programming methodology is different from conventional CPU programming, and forces developers to think in different terms. Second, a lot of architectural details that heavily influence program performance are not exposed to the programmer. The consequence is that in order to get fast programs developers must have a deep understanding of the targeted GPU platform. Moreover, mapping existing sequential programs to GPU is even more difficult. The reason is that not every part of a given program is suitable to be mapped to the GPU. Identifying those suitable parts implies looking for dependencies and recognizing data parallel patterns. This is not an easy task if the program code base is very large. The aim of this thesis is to ease the problem of mapping existing sequential programs to a GPU. To this end, we explore automatic program analyses that enable automatic and guided transformation of sequential programs to data parallel GPU kernels. Our main contributions consist of identification and implementation of key program analyses that enable such transformations. The result is a system that can identify kernel regions in a sequential program, detect GPU specific optimisations and provide additional kernel information that can be used to estimate performance. The work serves as a foundation for an automatic GPU parallelization system.

Tags: Computer science, CUDA, nVidia, Optimization, Performance, Programming techniques, Thesis

October 20, 2011 by hgpu

No votes yet.

Please wait...

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

high performance computing on graphics processing units: hgpu.org