high performance computing on graphics processing units: hgpu.org

Input Space Splitting for OpenCL

Simon Moll, Johannes Doerfert, Sebastian Hack

Saarland University, Germany

Saarland University, 2016

The performance of OpenCL programs suffers from memory and control flow divergence. Therefore, OpenCL compilers employ static analyses to identify non-divergent control flow and memory accesses in order to produce faster code. However, divergence is often input-dependent: it can be observed for some inputs, but not for all. In these cases, vectorizing compilers have to generate slow code because divergence can occur at run time. In this paper, we use a polyhedral abstraction to partition the input space of an OpenCL kernel. For each partition, divergence analysis produces more precise results, i.e., it can classify more parts of the code as non-divergent. Consequently, specializing the kernel for the input space partitions allows for generating better SIMD code because there is less divergence. We implemented our technique in an OpenCL driver for the AVX instruction set and evaluated it on a range of OpenCL benchmarks. We observe speedups of up to 9x for irregular kernels over a state-of-the-art vectorizing OpenCL driver.
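
The idea of specializing a kernel for input space partitions can be illustrated with a small sketch in plain C. This is not the authors' implementation and it does not use a polyhedral abstraction. It only assumes a hypothetical kernel that scales an array under a bounds check `gid < n`, emulates a SIMD group of 8 lanes with a loop, and splits the input space by hand into the part where the check holds for every lane of a group and the remainder. The names `scale_generic`, `scale_uniform`, `scale_dispatch` and `SIMD_WIDTH` are made up for this illustration.

```c
#include <stddef.h>

/* Assumption: 8 float lanes per SIMD group, as with 256-bit AVX registers. */
#define SIMD_WIDTH 8

/* Generic variant: the bounds check depends on the input n, so it may hold
   for some lanes of a group and fail for others. Each lane is guarded
   individually, which corresponds to masked (divergent) SIMD code. */
static void scale_generic(float *out, const float *in,
                          size_t base, size_t n, float factor)
{
    for (size_t lane = 0; lane < SIMD_WIDTH; ++lane) {
        size_t gid = base + lane;
        if (gid < n)
            out[gid] = in[gid] * factor;
    }
}

/* Specialized variant: only valid on the partition base + SIMD_WIDTH <= n.
   There the check is true for every lane, so the guard disappears and the
   loop body is straight-line code that vectorizes without masking. */
static void scale_uniform(float *out, const float *in,
                          size_t base, float factor)
{
    for (size_t lane = 0; lane < SIMD_WIDTH; ++lane)
        out[base + lane] = in[base + lane] * factor;
}

/* Run-time dispatch: test which partition a group falls into and call the
   matching variant. Returns how many groups took the specialized path. */
size_t scale_dispatch(float *out, const float *in,
                      size_t n, size_t global_size, float factor)
{
    size_t uniform_groups = 0;
    for (size_t base = 0; base < global_size; base += SIMD_WIDTH) {
        if (base + SIMD_WIDTH <= n) {
            scale_uniform(out, in, base, factor);
            ++uniform_groups;
        } else {
            scale_generic(out, in, base, n, factor);
        }
    }
    return uniform_groups;
}
```

Calling `scale_dispatch(out, in, 19, 24, 2.0f)` processes three groups of eight work items. The first two lie entirely below `n = 19` and take the specialized path, while the last group straddles the bound and falls back to the guarded variant. In the paper, the partitions are derived by the compiler rather than written by hand as they are here.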

Tags: Compilers, Computer science, Intel, LLVM, OpenCL

March 5, 2016 by hgpu
