high performance computing on graphics processing units: hgpu.org

Efficiently Using a CUDA-enabled GPU as Shared Resource

Hagen Peters, Martin Koper, Norbert Luttenberger

Research Group for Communication Systems, Department of Computer Science, Christian-Albrechts-University Kiel, Germany

IEEE 10th International Conference on Computer and Information Technology (CIT), 2010

DOI:10.1109/CIT.2010.204

@conference{peters2010efficiently,
  title={Efficiently using a CUDA-enabled GPU as shared resource},
  author={Peters, H. and Koper, M. and Luttenberger, N.},
  booktitle={Computer and Information Technology (CIT), 2010 IEEE 10th International Conference on},
  pages={1122--1127},
  year={2010},
  organization={IEEE}
}

GPGPU is becoming increasingly important, but when using CUDA-enabled GPUs, the special characteristics of NVIDIA's SIMT architecture have to be considered. In particular, it is not possible to run functions concurrently, although NVIDIA's GPUs consist of many processing units. Therefore, the processing power of a GPU cannot be shared among processes, and to use the GPU efficiently, a single function launch of a single process has to utilize it fully. In this contribution we present an approach that overcomes these restrictions. A GPGPU service launches a persistent kernel that consists of a set of device functions. The service controls kernel execution via memory transfers and provides interfaces through which clients can access the device functions. Using this novel approach, the GPU is shared by many clients at the same time, which greatly increases flexibility without loss of performance.
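The control flow described in the abstract can be illustrated with a CPU-only analogy. The sketch below is not the authors' implementation and contains no CUDA: a `std::thread` stands in for the persistent kernel, an atomic mailbox stands in for the memory region the service writes via memory transfers, and two plain functions stand in for device functions. All names (`Mailbox`, `Command`, `persistent_worker`, `submit`, `shutdown`, `scale`, `negate`) are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical command codes; the abstract does not specify the real protocol.
enum Command : int { CMD_IDLE = 0, CMD_SCALE = 1, CMD_NEGATE = 2, CMD_EXIT = 3 };

// Stand-in for the memory the service would update via memory transfers.
struct Mailbox {
    std::atomic<int> command{CMD_IDLE};   // written by the service, polled by the worker
    std::atomic<bool> done{false};        // set by the worker when a request finishes
    std::vector<float>* data = nullptr;   // payload of the current request
    std::mutex client_lock;               // serializes requests from several clients
};

// Stand-ins for the set of device functions compiled into the persistent kernel.
void scale(std::vector<float>& v)  { for (float& x : v) x *= 2.0f; }
void negate(std::vector<float>& v) { for (float& x : v) x = -x; }

// Stand-in for the persistent kernel: launched once, it loops until told to exit
// and dispatches to the requested function whenever a command appears.
void persistent_worker(Mailbox& box) {
    for (;;) {
        int cmd = box.command.load();
        if (cmd == CMD_EXIT) return;
        if (cmd == CMD_IDLE) { std::this_thread::yield(); continue; }
        assert(box.data != nullptr);
        if (cmd == CMD_SCALE)  scale(*box.data);
        if (cmd == CMD_NEGATE) negate(*box.data);
        box.command.store(CMD_IDLE);  // ready for the next request
        box.done.store(true);         // signal completion to the waiting client
    }
}

// Service-side interface: a client submits one request and waits for completion.
void submit(Mailbox& box, Command cmd, std::vector<float>& v) {
    std::lock_guard<std::mutex> guard(box.client_lock);
    box.data = &v;
    box.done.store(false);
    box.command.store(cmd);           // publishing the command starts the work
    while (!box.done.load()) std::this_thread::yield();
}

// Ask the worker loop to terminate.
void shutdown(Mailbox& box) {
    std::lock_guard<std::mutex> guard(box.client_lock);
    box.command.store(CMD_EXIT);
}
```

A caller would start the worker once, for example with `std::thread worker([&] { persistent_worker(box); });`, then issue any number of `submit` calls from one or more client threads, and finally call `shutdown(box)` followed by `worker.join()`. The point of the pattern is that the long-running loop is launched a single time and subsequent work is requested by writing to shared memory rather than by launching new functions.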

Tags: Computer science, CUDA, nVidia, nVidia GeForce GTX 260, nVidia GeForce GTX 280, Programming techniques

April 17, 2011 by hgpu
