GPU performance prediction using parametrized models

Andreas Resios

Utrecht University

Utrecht University, 2011

@article{resios2011gpu,
  title={GPU performance prediction using parametrized models},
  author={Resios, A. and Holdermans, V.F.D.S.},
  year={2011}
}

Compilation for modern architectures has become increasingly difficult as computers and computing needs have evolved. In particular, programmers expect the compiler to produce optimized code for a variety of hardware, making the most of its theoretical performance. For years this was not a problem, because hardware vendors consistently delivered increases in clock rates and instruction-level parallelism, so single-threaded programs ran faster on newer processors without any modification. Nowadays, to increase performance and overcome physical limitations, the hardware industry favours multi-core CPUs and massively parallel hardware accelerators (GPUs, FPGAs), and software has to be written explicitly in a multi-threaded or multi-process manner to gain performance. Thus, the performance problem has shifted from hardware designers to compiler writers and software developers, who now have to perform parallelization. Such a transformation involves identifying independent data and computation and mapping them to a complex hierarchy of memory, computing, and interconnection resources. When performing parallelization, it is important to take into account the overhead introduced by communication, thread spawning, and synchronization. If the overhead is high, the optimization can lead to a performance loss. Thus, an important step in this process is to evaluate whether the optimization brings any performance improvement. The answer is usually computed using a performance model, which is an abstraction of the target hardware [29, 30]. Our research addresses this problem in the context of parallelizing sequential programs for GPU platforms. The main result is a GPU performance model for data-parallel programs, which predicts the execution time and identifies bottlenecks of GPU programs. In this thesis we present the factors which influence GPU performance and show how our model takes them into account.
We validated our model in the context of a production-ready analysis tool, vfEmbedded [33], which combines static and dynamic analyses to parallelize C code for heterogeneous platforms. Since the tool has an interactive compilation workflow, our model not only estimates execution time but also computes several metrics which help users decide whether their program is worth porting to the GPU.
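
The trade-off described in the abstract, where communication and launch overhead can cancel out the gain from parallelization, can be illustrated with a toy calculation. The sketch below is not the model developed in the thesis: the class name, its fields, and the additive cost formula (launch overhead plus transfer time plus kernel time) are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class OffloadEstimate:
    """Hypothetical inputs for a toy offload decision (not the thesis's model)."""

    cpu_time_s: float              # sequential execution time on the CPU
    kernel_time_s: float           # estimated GPU kernel execution time
    bytes_transferred: int         # host <-> device traffic, both directions
    bandwidth_bytes_per_s: float   # assumed effective transfer bandwidth
    launch_overhead_s: float = 1e-5  # assumed fixed cost of launching the kernel

    def gpu_time_s(self) -> float:
        # Assumption: costs simply add up, with no overlap of
        # data transfer and computation.
        transfer = self.bytes_transferred / self.bandwidth_bytes_per_s
        return self.launch_overhead_s + transfer + self.kernel_time_s

    def speedup(self) -> float:
        # Ratio above 1.0 means the GPU version is predicted to be faster.
        return self.cpu_time_s / self.gpu_time_s()

    def worth_porting(self) -> bool:
        return self.speedup() > 1.0
```

Under these assumptions, a program whose kernel is fast but which moves a large amount of data may still report `worth_porting()` as `False`, which is the kind of performance loss the abstract warns about.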

Tags: Computer science, CUDA, Heterogeneous systems, nVidia, nVidia GeForce GTX 460, Optimization, Performance, Thesis

October 22, 2011 by hgpu
