A computing origami: Optimized code generation for emerging parallel platforms

Hagiescu Miriste Andrei Mihai

Politehnica University of Bucharest, Romania

Department of Computer Science, National University of Singapore, 2011

This thesis deals with code generation for parallel applications on emerging platforms, in particular FPGA- and GPU-based platforms. These platforms expose a large design space, across which performance is strongly affected by architectural idiosyncrasies. In this context, generating efficient code is a global optimization problem. The code generation methods described in this thesis apply to applications that expose a flexible parallel structure that is not bound to the target platform. The application is restructured in a way that can be intuitively visualized as origami (the Japanese art of paper folding). The thesis makes three significant contributions: (1) It provides code generation methods that start from a general stream processing language (StreamIt) and target both FPGA and GPU platforms. (2) It describes how the code generation methods can be extended beyond streaming applications to finer-grained parallel computation. On FPGAs, this is illustrated by a method that generates configurable floating-point SIMD coprocessors for vectorizable code. On GPUs, the method is extended to applications that expose fine-grained parallel code accompanied by a significant amount of read sharing. (3) It shows how these methods can be used on a platform that consists of multiple GPU devices connected to a host CPU. The methods can be applied to a broad range of applications. They go beyond mapping and provide tightly integrated code generation tools that jointly handle high-level mapping, code rewriting, optimization, and modular compilation. These methods target FPGA and GPU platforms without requiring user-added annotations. The results indicate the efficiency of the methods described.

Tags: Code generation, Computer science, CUDA, FPGA, nVidia, nVidia GeForce 8800 GTX, Optimization, Tesla S1070, Thesis

March 29, 2012 by hgpu
