high performance computing on graphics processing units: hgpu.org


Improving GPU Performance Prediction with Data Transfer Modeling

Michael Boyer, Jiayuan Meng, Kalyan Kumaran

Department of Computer Science, University of Virginia

Third International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), 2013

@article{boyer2013improving,
  title={Improving GPU Performance Prediction with Data Transfer Modeling},
  author={Boyer, Michael and Meng, Jiayuan and Kumaran, Kalyan},
  year={2013}
}


Accelerators such as graphics processors (GPUs) have become increasingly popular for high performance scientific computing. Often, much effort is invested in creating and optimizing GPU code without any guaranteed performance benefit. To reduce this risk, performance models can be used to project a kernel’s GPU performance potential before it is ported. However, raw GPU execution time is not the only consideration. The overhead of transferring data between the CPU and the GPU is also an important factor; for some applications, this overhead may even erase the performance benefits of GPU acceleration. To address this challenge, we propose a GPU performance modeling framework that predicts both kernel execution time and data transfer time. Our extensions to an existing GPU performance model include a data usage analyzer for a sequence of GPU kernels, to determine the amount of data that needs to be transferred, and a performance model of the PCIe bus, to determine how long the data transfer will take. We have tested our framework using a set of applications running on a production machine at Argonne National Laboratory. On average, our model predicts the data transfer overhead with an error of only 8%, and the inclusion of data transfer time reduces the error in the predicted GPU speedup from 255% to 9%.
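To illustrate why transfer time can dominate the predicted speedup, here is a minimal Python sketch. It assumes a simple latency-plus-bandwidth model of the PCIe bus, which is a common first approximation and not necessarily the model used in the paper. The function names and the default latency and bandwidth values are hypothetical placeholders, not measurements from the paper.

```python
def transfer_time(num_bytes, latency_s=10e-6, bandwidth_bytes_per_s=6e9):
    """Estimate the time for one PCIe transfer.

    Assumption: time = fixed latency + size / sustained bandwidth.
    The default latency and bandwidth are illustrative placeholders.
    """
    return latency_s + num_bytes / bandwidth_bytes_per_s


def predicted_speedup(cpu_time_s, kernel_time_s,
                      bytes_to_gpu=0, bytes_from_gpu=0,
                      include_transfer=True):
    """Predicted GPU speedup over the CPU, with or without transfer overhead."""
    gpu_time_s = kernel_time_s
    if include_transfer:
        # One host-to-device and one device-to-host copy (an assumption;
        # a real kernel sequence may need more or fewer transfers).
        gpu_time_s += transfer_time(bytes_to_gpu)
        gpu_time_s += transfer_time(bytes_from_gpu)
    return cpu_time_s / gpu_time_s


if __name__ == "__main__":
    # Hypothetical workload: 1 s on the CPU, 0.1 s kernel, 600 MB each way.
    kernel_only = predicted_speedup(1.0, 0.1, include_transfer=False)
    with_transfer = predicted_speedup(1.0, 0.1, 600e6, 600e6)
    print(f"kernel-only speedup:   {kernel_only:.2f}x")
    print(f"with transfer speedup: {with_transfer:.2f}x")
```

In this hypothetical workload, ignoring the transfers overstates the speedup by roughly a factor of three, which is the kind of gap the data usage analyzer and PCIe model described above are meant to close.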

Tags: Computer science, CUDA, nVidia, nVidia Quadro FX 5600, Performance

April 6, 2013 by hgpu

