Sieve: Stratified GPU-Compute Workload Sampling
Ghent University, Belgium
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2023
@inproceedings{naderan2023sieve,
  title={Sieve: Stratified GPU-Compute Workload Sampling},
  author={Naderan-Tahan, Mahmood and SeyyedAghaei, Hossein and Eeckhout, Lieven},
  booktitle={2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)},
  pages={224--234},
  year={2023},
  organization={IEEE}
}
To exploit the ever-increasing compute capabilities offered by GPU hardware, GPU-compute workloads have evolved from simple computational kernels to large-scale programs with complex software stacks and numerous kernels. Driving architecture exploration with real workloads hence becomes increasingly challenging, to the point of becoming intractable because of the extremely long simulation times incurred by existing architecture simulators. Sampling is a widely used technique to speed up simulation; however, the state-of-the-art sampling method for GPU-compute workloads, Principal Kernel Selection (PKS), falls short for challenging workloads with large numbers of kernels and kernel invocations. This paper presents Sieve, an accurate and low-overhead stratified sampling methodology for GPU-compute workloads that groups kernel invocations based on their instruction count, with the goal of minimizing the execution-time variability within strata. For the challenging Cactus and MLPerf workloads, we report that Sieve achieves an average prediction error of 1.2% (and at most 3.2%) versus 16.5% (and up to 60.4%) for PKS on real hardware (Nvidia Ampere GPU), while maintaining a similar simulation speedup of three orders of magnitude. We further demonstrate that Sieve reduces profiling time by 8x on average (and up to 98x) compared to PKS.
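To illustrate the general idea behind stratified sampling of kernel invocations by instruction count, the following is a minimal Python sketch, not the authors' Sieve implementation: the number of strata, the equal-population stratum boundaries, and the instruction-count-proportional weighting are all assumptions made for illustration.

```python
# Illustrative sketch of stratified sampling of kernel invocations by
# instruction count; not the Sieve tool itself. Stratum boundaries, the
# number of strata, and the weighting scheme are assumptions.
import random
from collections import defaultdict

def stratified_sample(invocations, num_strata=8, seed=0):
    """invocations: list of (invocation_id, instruction_count) tuples.
    Returns (invocation_id, weight) pairs: one representative per stratum,
    weighted so its measured time can be extrapolated to the stratum."""
    rng = random.Random(seed)
    counts = sorted(c for _, c in invocations)
    # Equal-population strata: boundaries at quantiles of the instruction-count
    # distribution, so invocations of similar size fall in the same stratum.
    bounds = [counts[int(len(counts) * (i + 1) / num_strata) - 1]
              for i in range(num_strata)]

    strata = defaultdict(list)
    for inv_id, count in invocations:
        s = next(i for i, b in enumerate(bounds) if count <= b)
        strata[s].append((inv_id, count))

    sample = []
    for members in strata.values():
        rep_id, rep_count = rng.choice(members)  # one representative invocation
        stratum_insns = sum(c for _, c in members)
        # Weight scales the representative's simulated time up to the whole
        # stratum, assuming time roughly proportional to instruction count.
        sample.append((rep_id, stratum_insns / rep_count))
    return sample

# Usage: simulate only the sampled invocations, then multiply each measured
# execution time by its weight and sum to estimate full-program time.
```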
August 28, 2023 by hgpu