high performance computing on graphics processing units: hgpu.org

A programming language interface to describe transformations and code generation

Gabe Rudy, Malik Murtaza Khan, Mary Hall, Chun Chen, Jacqueline Chame

School of Computing, University of Utah, Salt Lake City, UT

Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, Volume 6548/2011, 136-150, 2011

DOI:10.1007/978-3-642-19595-2_10

@article{rudy2011programming,
  title={A programming language interface to describe transformations and code generation},
  author={Rudy, G. and Khan, M. and Hall, M. and Chen, C. and Chame, J.},
  journal={Languages and Compilers for Parallel Computing},
  pages={136--150},
  year={2011},
  publisher={Springer}
}

This paper presents a programming language interface, a complete scripting language, to describe composable compiler transformations. These transformation programs can be written, shared and reused by non-expert application and library developers. From a compiler writer’s perspective, a scripting language interface permits rapid prototyping of compiler algorithms that can mix levels and compose different sequences of transformations, producing readable code as output. From a library or application developer’s perspective, the use of transformation programs permits expression of clean high-level code, and a separate description of how to map that code to architectural features, easing maintenance and porting to new architectures. We illustrate this interface in the context of CUDA-CHiLL, a source-to-source compiler transformation and code generation framework that transforms sequential loop nests to high-performance GPU code. We show how this high-level transformation and code generation language can be used to express: (1) complex transformation sequences, exemplified by a single loop restructuring construct used to generate a series of tiling and permute commands; and (2) complex code generation sequences to produce CUDA code from a high-level specification. We demonstrate that the automatically generated code either performs close to or outperforms two hand-tuned GPU library kernels from Nvidia’s CUBLAS 2.2 and 3.2 libraries.

Tags: Algorithms, Code generation, Computer science, CUBLAS, CUDA, MPI, nVidia, nVidia GeForce GTX 280, Tesla C2050

December 1, 2011 by hgpu
