high performance computing on graphics processing units: hgpu.org


Adaptive implementation selection in the SkePU skeleton programming library

Usman Dastgeer, Lu Li, Christoph Kessler

IDA, Linkoping University, 58183 Linkoping, Sweden

Biennial Conference on Advanced Parallel Processing Technology (APPT-2013), 2013


Package: SkePU 1.0


In earlier work, we developed the SkePU skeleton programming library for modern multicore systems equipped with one or more programmable GPUs. The library internally provides four types of implementations (implementation variants) for each skeleton: serial C++, OpenMP, CUDA and OpenCL, targeting CPU or GPU execution. Which implementation runs fastest for a given skeleton call depends on the computation, the problem size(s), the system architecture and data locality. In this paper, we present our work on automatic selection between these implementation variants using an offline machine learning method that generates a compact decision tree with low training overhead. The proposed selection mechanism is flexible yet high-level, allowing a skeleton programmer to control different training choices at a higher level of abstraction. We evaluated our optimization strategy with 9 applications/kernels ported to our skeleton library and achieved, on average, more than 94% and 90% accuracy on two systems while exploring just 0.53% and 0.58% of the training space, respectively. Moreover, we discuss one application scenario where local optimization that considers a single skeleton call can prove sub-optimal, and we propose a heuristic for bulk implementation selection that considers more than one skeleton call to address such scenarios.
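The abstract does not spell out the training algorithm, so the following Python sketch is only an illustration of the general idea: probe a small part of the problem-size range offline, then store the result as a small decision tree that is consulted at call time. Everything in it is an assumption made for illustration. The cost formulas are made up (they are not measurements from the paper), the names `measure_fastest`, `train` and `select` are hypothetical and not part of the SkePU API, and the endpoint-based splitting rule is one plausible sampling strategy rather than the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Hypothetical cost models in arbitrary integer units. These stand in for
# real timing runs and are NOT taken from the paper.
COSTS = {
    "serial": lambda n: 20 * n,
    "openmp": lambda n: 1000 + 5 * n,
    "cuda": lambda n: 8002 + n,
}


def measure_fastest(n: int) -> str:
    """Stand-in for a training run: name of the cheapest variant at size n."""
    return min(COSTS, key=lambda name: COSTS[name](n))


@dataclass
class Leaf:
    variant: str


@dataclass
class Split:
    threshold: int  # sizes <= threshold go to `low`, larger sizes to `high`
    low: "Tree"
    high: "Tree"


Tree = Union[Leaf, Split]


def train(lo: int, hi: int, probes: Dict[int, str]) -> Tree:
    """Probe only the end points of [lo, hi]; split only if their winners differ.

    Assumes the winner changes monotonically with size, so a range whose
    end points agree is treated as uniform without probing its interior.
    """
    for n in (lo, hi):
        if n not in probes:
            probes[n] = measure_fastest(n)
    if probes[lo] == probes[hi]:
        return Leaf(probes[lo])
    mid = (lo + hi) // 2
    return Split(mid, train(lo, mid, probes), train(mid + 1, hi, probes))


def select(tree: Tree, n: int) -> str:
    """Run-time lookup: walk the tree to the leaf that covers size n."""
    while isinstance(tree, Split):
        tree = tree.low if n <= tree.threshold else tree.high
    return tree.variant


# Offline training over sizes 1 .. 2**20, then a few run-time lookups.
probes: Dict[int, str] = {}
tree = train(1, 1 << 20, probes)
for size in (10, 1000, 100000):
    print(size, select(tree, size))
print("probed sizes:", len(probes), "of", 1 << 20)
```

The design choice to highlight is that training cost is driven by the number of crossover points between variants rather than by the size of the range, which is why a small fraction of the training space can suffice. The sketch covers a single dimension (problem size) and a single call; it does not model data locality or the bulk selection across several skeleton calls that the abstract mentions.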

Tags: Computer science, CUDA, Machine learning, nVidia, OpenCL, Package, Performance, Tesla C1060, Tesla C2050

November 19, 2013 by hgpu

