high performance computing on graphics processing units: hgpu.org


Automatic C-to-CUDA Code Generation for Affine Programs

Muthu Baskaran, J. Ramanujam, P. Sadayappan

The Ohio State University, USA

In Compiler Construction, Vol. 6011 (2010), pp. 244-263

DOI:10.1007/978-3-642-11970-5_14


Graphics Processing Units (GPUs) offer tremendous computational power. CUDA (Compute Unified Device Architecture) provides a multi-threaded parallel programming model, facilitating high performance implementations of general-purpose computations. However, the explicitly managed memory hierarchy and multi-level parallel view make manual development of high-performance CUDA code rather complicated. Hence the automatic transformation of sequential input programs into efficient parallel CUDA programs is of considerable interest. This paper describes an automatic code transformation system that generates parallel CUDA code from input sequential C code, for regular (affine) programs. Using and adapting publicly available tools that have made polyhedral compiler optimization practically effective, we develop a C-to-CUDA transformation system that generates two-level parallel CUDA code that is optimized for efficient data access. The performance of automatically generated code is compared with manually optimized CUDA code for a number of benchmarks. The performance of the automatically generated CUDA code is quite close to hand-optimized CUDA code and considerably better than the benchmarks’ performance on a multicore CPU.
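As a rough illustration of what "two-level parallel" code with staged data access can look like for an affine loop nest, the C sketch below contrasts a sequential matrix-vector product with a restructured version. It is not output of the system described in the paper. The function names, `ROWS_PER_BLOCK`, and `TILE` are hypothetical, the outer and inner loops only stand in for CUDA thread blocks and threads, and the small local array stands in for on-chip shared memory.

```c
#include <assert.h>

#define ROWS_PER_BLOCK 4  /* hypothetical: rows handled by one "thread block" */
#define TILE 4            /* hypothetical: elements of x staged at a time */

/* Sequential affine loop nest: every subscript is an affine function of i and j. */
void matvec_seq(int n, const float *A, const float *x, float *y) {
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        float acc = 0.0f;
        for (int j = 0; j < n; j++)
            acc += A[i * n + j] * x[j];
        y[i] = acc;
    }
}

/* Same computation restructured into two levels.
 * Loop over b: iterations are independent, as thread blocks would be.
 * Loop over t: iterations are independent, as threads in a block would be.
 * x_tile is copied once per tile and reused by every row in the block,
 * mimicking a shared-memory buffer. */
void matvec_two_level(int n, const float *A, const float *x, float *y) {
    assert(n > 0);
    for (int b = 0; b < n; b += ROWS_PER_BLOCK) {           /* block level */
        float acc[ROWS_PER_BLOCK] = {0.0f};
        for (int jt = 0; jt < n; jt += TILE) {
            float x_tile[TILE];                             /* staged copy of x */
            int len = (n - jt < TILE) ? (n - jt) : TILE;
            for (int k = 0; k < len; k++)
                x_tile[k] = x[jt + k];
            for (int t = 0; t < ROWS_PER_BLOCK; t++) {      /* thread level */
                int i = b + t;
                if (i >= n) continue;                       /* guard partial block */
                for (int k = 0; k < len; k++)
                    acc[t] += A[i * n + jt + k] * x_tile[k];
            }
        }
        for (int t = 0; t < ROWS_PER_BLOCK; t++)
            if (b + t < n) y[b + t] = acc[t];
    }
}
```

Both functions take a row-major `n` by `n` matrix and should produce the same result vector. In an actual CUDA version, `b` and `t` would be derived from `blockIdx.x` and `threadIdx.x`, and the copy into `x_tile` would be followed by a barrier before the tile is read.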

Tags: Code generation, Compilers, Computer science, CUDA, nVidia, nVidia GeForce 8800 GTX

February 8, 2011 by hgpu
