high performance computing on graphics processing units: hgpu.org

Automatic run-time mapping of polyhedral computations to heterogeneous devices with memory-size restrictions

Yuri Torres, Arturo Gonzalez-Escribano, Diego R. Llanos

Departamento de Informatica, Edif. Tecn. de la Informacion, Universidad de Valladolid, Campus Miguel Delibes, 47011 Valladolid, Spain

The 19th International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’13), 2013

Tools that automatically map parallel computations to heterogeneous, hierarchical systems try to divide the whole computation into parts whose computational loads match the capabilities of the target devices. Some parts are executed on the node's cores, while others are executed on accelerator devices. Each part requires one or more pieces of the data structures, which must be allocated in device memory during the computation. In this paper we present a model that allows such automatic mapping tools to transparently assign computations to heterogeneous devices with different memory-size restrictions. The model requires the programmer to specify the access patterns of the computation threads in a simple abstract form. This information is used at run time to determine a second-level partition of the computation assigned to a device, ensuring that the data pieces required by each sub-part fit in the target device's memory and that the number of kernels launched is minimal. We present experimental results with a prototype implementation of the model that works for regular polyhedral expressions. We show how it works for different example applications and access patterns, transparently executing large computations on devices with different memory-size restrictions.
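The second-level partition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical one-dimensional stencil in which thread `i` reads `in[i - left .. i + right]` and writes `out[i]`, and the names `AccessPattern`, `footprint_bytes`, and `partition` are invented for this example. Under those assumptions, a greedy scan that makes each chunk as large as the memory limit allows yields the fewest contiguous chunks, and therefore the fewest kernel launches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPattern:
    """Hypothetical 1-D pattern: thread i reads in[i-left .. i+right], writes out[i]."""
    left: int
    right: int


def footprint_bytes(lo, hi, n, pattern, elem_size):
    """Device bytes needed to compute indices [lo, hi) of an n-element problem."""
    read_lo = max(0, lo - pattern.left)    # clip the halo at the array start
    read_hi = min(n, hi + pattern.right)   # clip the halo at the array end
    reads = read_hi - read_lo
    writes = hi - lo
    return (reads + writes) * elem_size


def partition(n, pattern, elem_size, mem_bytes):
    """Split [0, n) into the fewest contiguous chunks that each fit in mem_bytes.

    The footprint never shrinks when a chunk grows, so taking the largest
    feasible chunk at every step minimizes the number of chunks.
    """
    chunks = []
    lo = 0
    while lo < n:
        if footprint_bytes(lo, lo + 1, n, pattern, elem_size) > mem_bytes:
            raise ValueError("a single-element chunk does not fit in device memory")
        # Binary search: [lo, good) fits; [lo, bad) does not fit or bad > n.
        good, bad = lo + 1, n + 1
        while bad - good > 1:
            mid = (good + bad) // 2
            if footprint_bytes(lo, mid, n, pattern, elem_size) <= mem_bytes:
                good = mid
            else:
                bad = mid
        chunks.append((lo, good))
        lo = good
    return chunks


# Example: 1000 four-byte elements, 3-point stencil, 1 KiB of device memory.
chunks = partition(1000, AccessPattern(left=1, right=1), 4, 1024)
print(len(chunks), chunks[0], chunks[-1])
```

In a real mapping tool each returned chunk would correspond to one kernel launch, preceded by a transfer of the clipped read range. The memory limit here is a plain parameter; how the paper's prototype obtains it and how it handles multi-dimensional polyhedral patterns is not stated in the abstract.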

Tags: Computer science, CUDA, Heterogeneous systems, Memory model, nVidia, nVidia GeForce GTX 680

October 12, 2013 by hgpu
