high performance computing on graphics processing units: hgpu.org


Developing a High Performance GPGPU Compiler Using Cetus

Yi Yang, Huiyang Zhou

North Carolina State University

Cetus Users and Compiler Infrastructure Workshop, International Conference on Parallel Architectures and Compilation Techniques (PACT’11), 2011

@inproceedings{yang2011developing,
  title={Developing a High Performance GPGPU Compiler Using Cetus},
  author={Yang, Yi and Zhou, Huiyang},
  booktitle={Cetus Users and Compiler Infrastructure Workshop, International Conference on Parallel Architectures and Compilation Techniques (PACT’11)},
  year={2011}
}


Source codes

Package: gpgpucompiler


In this paper we present our experience in developing an optimizing compiler for general-purpose computation on graphics processing units (GPGPU), built on the Cetus compiler framework. The input to our compiler is a naive GPU kernel procedure: one that is functionally correct but written without any consideration for performance. The compiler applies a set of optimization techniques to the naive kernel and generates an optimized GPU kernel. The Cetus infrastructure facilitates the implementation. In Cetus, a code transformation is called a pass, and we classify the passes used in our work into two categories: functional passes and optimization passes. The functional passes translate input kernels into the desired intermediate representation, which clearly represents memory access patterns and thread configurations. The CUDA language support pass is derived from MCUDA. A series of optimization passes then improves kernel performance by adapting the kernels to the GPGPU architecture. Our experiments show that the optimized code achieves very high performance, either superior or very close to that of highly fine-tuned libraries.
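The two pass categories described in the abstract can be pictured with a small toy pipeline. This is only an illustrative sketch in Python, not the Cetus API (Cetus itself is a Java framework) and not the authors' implementation. The `Kernel` class, the `make_pass` helper, and every pass label are hypothetical, and running the functional passes before the optimization passes is an assumption drawn from the abstract's description of their roles.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Kernel:
    """Toy stand-in for a kernel's intermediate representation (hypothetical)."""
    name: str
    log: List[str] = field(default_factory=list)  # labels of passes applied, in order


# A pass takes a kernel and returns the (possibly transformed) kernel.
Pass = Callable[[Kernel], Kernel]


def make_pass(label: str) -> Pass:
    """Build a placeholder pass that only records that it ran."""
    def run(kernel: Kernel) -> Kernel:
        kernel.log.append(label)
        return kernel
    return run


# Hypothetical pass names; the abstract does not list the individual passes.
FUNCTIONAL_PASSES: List[Pass] = [
    make_pass("functional:parse_naive_kernel"),
    make_pass("functional:build_intermediate_representation"),
]
OPTIMIZATION_PASSES: List[Pass] = [
    make_pass("optimization:example_a"),
    make_pass("optimization:example_b"),
]


def compile_kernel(kernel: Kernel) -> Kernel:
    """Apply functional passes first, then optimization passes (assumed order)."""
    for apply_pass in FUNCTIONAL_PASSES + OPTIMIZATION_PASSES:
        kernel = apply_pass(kernel)
    return kernel


result = compile_kernel(Kernel("naive_kernel"))
for label in result.log:
    print(label)
```

Each placeholder pass only appends its label to a log, so the printed list shows nothing more than the order in which the two categories would be applied; a real pass would rewrite the kernel's representation instead.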

Tags: ATI, ATI Radeon HD 5850, ATI Radeon HD 5870, Code generation, Compilers, Computer science, CUDA, nVidia, nVidia GeForce GTX 480, Optimization

October 13, 2011 by hgpu
