high performance computing on graphics processing units: hgpu.org


Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Joshua Bakita, James H. Anderson

University of North Carolina at Chapel Hill, NC, USA

37th Euromicro Conference on Real-Time Systems (ECRTS), 2025

DOI:10.4230/LIPIcs.ECRTS.2025.21

@inproceedings{bakita2025hardware,
  title={Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems},
  author={Bakita, Joshua and Anderson, James H},
  booktitle={37th Euromicro Conference on Real-Time Systems (ECRTS 2025)},
  pages={21--1},
  year={2025},
  organization={Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik}
}


As GPU-using tasks become more common in embedded, safety-critical systems, efficiency demands necessitate sharing a single GPU among multiple tasks. Unfortunately, existing ways to schedule multiple tasks onto a GPU often either result in a loss of ability to meet deadlines, or a loss of efficiency. In this work, we develop a system-level spatial compute partitioning mechanism for NVIDIA GPUs and demonstrate that it can be used to execute tasks efficiently without compromising timing predictability. Our tool, called nvtaskset, supports composable systems by not requiring task, driver, or hardware modifications. In our evaluation, we demonstrate sub-1-μs overheads, stronger partition enforcement, and finer-granularity partitioning when using our mechanism instead of NVIDIA’s Multi-Process Service (MPS) or Multi-Instance GPU (MIG) features.
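The general idea of spatial compute partitioning mentioned in the abstract can be illustrated with a small, purely conceptual sketch: each task is given a disjoint subset of a GPU's compute units, represented here as a bitmask. This is not nvtaskset's interface or implementation. The function names `make_partitions` and `are_disjoint`, and the unit counts used, are hypothetical and chosen only for illustration.

```python
def make_partitions(num_units, shares):
    """Assign each task a contiguous, disjoint bitmask of compute units.

    num_units: total number of compute units (hypothetical value).
    shares:    number of units requested by each task, in order.
    Returns one integer bitmask per task; bit i set means unit i is assigned.
    """
    if any(share <= 0 for share in shares):
        raise ValueError("each task needs at least one unit")
    if sum(shares) > num_units:
        raise ValueError("requested more units than are available")

    masks = []
    next_unit = 0  # index of the first unit not yet assigned
    for share in shares:
        # Build `share` consecutive set bits, then shift them into place.
        mask = ((1 << share) - 1) << next_unit
        masks.append(mask)
        next_unit += share
    return masks


def are_disjoint(masks):
    """Return True if no compute unit appears in more than one mask."""
    seen = 0
    for mask in masks:
        if seen & mask:
            return False
        seen |= mask
    return True


if __name__ == "__main__":
    # Two hypothetical tasks sharing 8 units: 2 for the first, 6 for the second.
    masks = make_partitions(8, [2, 6])
    print([bin(m) for m in masks], are_disjoint(masks))
```

Disjoint masks are what make the partitioning spatial: tasks run at the same time on separate units rather than taking turns on the whole device.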

Tags: Computer science, CUDA, nVidia, nVidia GeForce RTX 4090, nVidia RTX 6000 Ada, Package, Task scheduling

July 13, 2025 by hgpu

