high performance computing on graphics processing units: hgpu.org

A compiler framework for optimization of affine loop nests for GPGPUs

Muthu M. Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan

Department of Computer Science and Engineering, The Ohio State University

In ICS ’08: Proceedings of the 22nd annual international conference on Supercomputing (2008), pp. 225-234

DOI:10.1145/1375527.1375562

@conference{baskaran2008compiler,
  title={A compiler framework for optimization of affine loop nests for {GPGPUs}},
  author={Baskaran, M.M. and Bondhugula, U. and Krishnamoorthy, S. and Ramanujam, J. and Rountev, A. and Sadayappan, P.},
  booktitle={Proceedings of the 22nd annual international conference on Supercomputing},
  pages={225--234},
  year={2008},
  organization={ACM}
}

GPUs are a class of specialized parallel architectures with tremendous computational power. The new Compute Unified Device Architecture (CUDA) programming model from NVIDIA facilitates programming of general-purpose applications on their GPUs. However, manual development of high-performance parallel code for GPUs is still very challenging. This paper addresses a number of issues toward the goal of developing a compiler framework for automatic parallelization and performance optimization of affine loop nests on GPGPUs: 1) an approach to program transformation for efficient data access from GPU global memory, using a polyhedral compiler model of data dependence abstraction and program transformation; 2) determination of optimal padding factors for conflict-minimal data access from GPU shared memory; and 3) model-driven empirical search to determine optimal parameters for unrolling and tiling. Experimental results on a number of kernels demonstrate the effectiveness of the compiler optimization approaches developed.
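To make item 2 concrete, here is a minimal Python sketch of why padding a shared-memory tile can remove bank conflicts. It is an illustration only and not the paper's algorithm: it assumes 16 banks of one-word width and a group of 16 threads that each read the first element of a different row of a row-major tile, and it finds a padding by brute force. The names `conflict_degree` and `smallest_conflict_free_padding` are hypothetical.

```python
from collections import Counter

# Assumption: 16 shared-memory banks, one word wide, with consecutive
# words mapped to consecutive banks. Real hardware may differ.
NUM_BANKS = 16


def conflict_degree(row_stride, num_threads=16, num_banks=NUM_BANKS):
    """Return the largest number of threads that hit the same bank.

    Thread t reads word t * row_stride, i.e. element [t][0] of a
    row-major tile whose rows are row_stride words apart.
    A result of 1 means the access is conflict-free.
    """
    banks = Counter((t * row_stride) % num_banks for t in range(num_threads))
    return max(banks.values())


def smallest_conflict_free_padding(tile_width, num_threads=16,
                                   num_banks=NUM_BANKS):
    """Brute-force the smallest row padding (in words) that makes the
    column access above conflict-free; None if no padding below
    num_banks works."""
    for pad in range(num_banks):
        if conflict_degree(tile_width + pad, num_threads, num_banks) == 1:
            return pad
    return None


if __name__ == "__main__":
    for width in (12, 16, 17, 32):
        pad = smallest_conflict_free_padding(width)
        print(f"width={width}: unpadded degree={conflict_degree(width)}, "
              f"padding={pad}")
```

Under these assumptions, a tile that is 16 words wide sends every thread to the same bank, while a row stride of 17 words spreads the 16 threads over 16 distinct banks. A compiler would derive such factors analytically from the access functions rather than by search; the brute-force loop here only shows the effect.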

Tags: Computer science, CUDA, High-level Languages, nVidia, nVidia GeForce 8800 GTX, Optimization

December 12, 2010 by hgpu
