
A model of dynamic compilation for heterogeneous compute platforms

Andrew Kerr
PhD dissertation, Georgia Institute of Technology, 2013

@phdthesis{kerr2013model,
   title={A model of dynamic compilation for heterogeneous compute platforms},
   author={Kerr, Andrew},
   year={2013},
   school={Georgia Institute of Technology}
}

Trends in computer engineering place renewed emphasis on increasing parallelism and heterogeneity. The rise of parallelism adds an additional dimension to the challenge of portability, as different processors support different notions of parallelism, whether vector parallelism executing in a few threads on multicore CPUs or large-scale thread hierarchies on GPUs. Thus, software faces obstacles to portability and efficient execution beyond differences in instruction sets; the underlying execution models of radically different architectures may simply not be compatible. Dynamic compilation applied to data-parallel heterogeneous architectures provides an abstraction layer that decouples program representations from optimized binaries, enabling portability without encumbering performance.

This dissertation proposes several techniques that extend dynamic compilation to data-parallel execution models. These contributions include:

– characterization of data-parallel workloads;
– machine-independent application metrics;
– a framework for performance modeling and prediction;
– execution model translation for vector processors;
– region-based compilation and scheduling.

We evaluate these claims via the development of a novel dynamic compilation framework, GPU Ocelot, with which we execute real-world workloads from GPU computing. This enables GPU computing workloads to run efficiently on multicore CPUs, GPUs, and a functional simulator. We show that data-parallel workloads exhibit performance scaling, take advantage of vector instruction set extensions, and effectively exploit data locality via scheduling that attempts to maximize control locality.