high performance computing on graphics processing units: hgpu.org

Parametric GPU Code Generation for Affine Loop Programs

Athanasios Konstantinidis, Paul H. J. Kelly, J. Ramanujam, P. Sadayappan

Imperial College London

The 26th International Workshop on Languages and Compilers for Parallel Computing (LCPC), 2013

@article{konstantinidis2013parametric,
  title={Parametric GPU Code Generation for Affine Loop Programs},
  author={Konstantinidis, Athanasios and Kelly, Paul HJ and Ramanujam, J and Sadayappan, P},
  year={2013}
}

Source code package: pTileGPU

Partitioning a parallel computation into finitely sized chunks for effective mapping onto a parallel machine is a critical concern for source-to-source compilation. In the context of OpenCL and CUDA, this translates to the definition of a uniform hyper-rectangular partitioning of the parallel execution space, where each partition is subject to a fine-grained distribution of resources that has a direct yet hard-to-estimate impact on performance. This paper develops the first compilation scheme for generating parametrically tiled codes for affine loop programs on GPUs, which facilitates run-time exploration of partitioning parameters as a fast and portable way of finding the ones that yield maximum performance. Our approach is based on a parametric tiling scheme for producing wavefronts of parallel rectangular partitions of parametric size, and on a novel runtime system that manages wavefront execution and local memory usage dynamically through an inspector-executor mechanism. Our experimental evaluation demonstrates the effectiveness of our approach for wavefront as well as rectangularly parallel partitionings.
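To make the idea of wavefronts of rectangular tiles with run-time tile sizes more concrete, here is a minimal CPU-side sketch in Python. It is not the paper's code generator or its runtime system. The example loop nest (each cell depends on its upper and left neighbours) and the names `reference`, `wavefront_tiles`, and `tiled` are assumptions chosen for illustration only.

```python
def reference(n, m):
    """Untiled affine loop nest: a[i][j] depends on a[i-1][j] and a[i][j-1]."""
    a = [[1] * m for _ in range(n)]
    for i in range(1, n):
        for j in range(1, m):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def wavefront_tiles(n, m, ti, tj):
    """Yield one list of tile origins per wavefront.

    Tiles on the same wavefront (tile row + tile column == w) do not
    depend on each other, so they could be executed in parallel.
    """
    nti = (n + ti - 1) // ti  # number of tile rows (ceiling division)
    ntj = (m + tj - 1) // tj  # number of tile columns
    for w in range(nti + ntj - 1):
        yield [(bi * ti, (w - bi) * tj)
               for bi in range(max(0, w - ntj + 1), min(nti, w + 1))]


def tiled(n, m, ti, tj):
    """Same computation, executed tile by tile in wavefront order.

    ti and tj are ordinary run-time arguments: changing them needs no
    regeneration of the loop code (the 'parametric' part of the sketch).
    """
    a = [[1] * m for _ in range(n)]
    for front in wavefront_tiles(n, m, ti, tj):
        for i0, j0 in front:  # assumption: on a GPU, one block per tile
            for i in range(max(i0, 1), min(i0 + ti, n)):
                for j in range(max(j0, 1), min(j0 + tj, m)):
                    a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a
```

Because the tile sizes are plain arguments, a caller can try several `(ti, tj)` pairs at run time, time each one, and keep the fastest, which is the kind of exploration the abstract describes. Every choice must yield the same result as `reference`.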

Tags: Code generation, Computer science, CUDA, nVidia, nVidia GeForce GT 540 M, nVidia GeForce GTX 580, OpenCL, Package, Tesla K20, Tesla M2070

October 5, 2013 by hgpu
