high performance computing on graphics processing units: hgpu.org


Mascar: Speeding up GPU Warps by Reducing Memory Pitstops

Ankit Sethia, D. Anoushe Jamshidi, Scott Mahlke

Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, MI

21st IEEE Symposium on High Performance Computer Architecture (HPCA), 2015

@article{sethia2015mascar,
  title={Mascar: Speeding up GPU Warps by Reducing Memory Pitstops},
  author={Sethia, Ankit and Jamshidi, D Anoushe and Mahlke, Scott},
  year={2015}
}


With the prevalence of GPUs as throughput engines for data-parallel workloads, the landscape of GPU computing is changing significantly. Non-graphics workloads with high memory intensity and irregular access patterns are frequently targeted for acceleration on GPUs. While GPUs provide large numbers of compute resources, the resources needed for memory-intensive workloads are scarcer. Therefore, managing access to these limited memory resources is a challenge for GPUs. We propose a novel Memory Aware Scheduling and Cache Access Re-execution (Mascar) system on GPUs, tailored to improve the performance of memory-intensive workloads. This scheme detects memory saturation and prioritizes memory requests among warps to enable better overlapping of compute and memory accesses. Furthermore, it enables limited re-execution of memory instructions to eliminate structural hazards in the memory subsystem and to take advantage of cache locality in cases where requests cannot be sent to memory because of memory saturation. Our results show that Mascar provides a 34% speedup over the baseline round-robin scheduler and a 10% speedup over state-of-the-art warp schedulers for memory-intensive workloads. Mascar also achieves an average energy saving of 12% for such workloads.
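To give some intuition for why prioritizing memory requests among warps can improve overlap under saturation, the following is a minimal toy model in Python. It is not the Mascar design or the authors' simulator: the function `simulate`, its parameters, and the machine it models (a memory system that accepts one request per cycle with no extra latency, and a single compute unit that runs one warp's compute phase at a time) are all assumptions made for illustration.

```python
def simulate(num_warps, mem_reqs, compute_cycles, policy):
    """Total cycles for a toy core (hypothetical model, not Mascar itself).

    Assumptions: every warp issues `mem_reqs` memory requests and then needs
    `compute_cycles` of compute; the saturated memory system accepts one
    request per cycle; compute phases are serialized on one compute unit.
    """
    remaining = [mem_reqs] * num_warps
    ready_at = [0] * num_warps  # cycle at which each warp has all its data
    cycle = 0
    rr = 0  # rotating pointer used by the round-robin policy

    while any(r > 0 for r in remaining):
        if policy == "round_robin":
            # Rotate over warps, skipping those with nothing left to issue.
            while remaining[rr] == 0:
                rr = (rr + 1) % num_warps
            warp = rr
            rr = (rr + 1) % num_warps
        elif policy == "priority":
            # Let the lowest-numbered unfinished warp issue all its requests.
            warp = next(i for i, r in enumerate(remaining) if r > 0)
        else:
            raise ValueError(f"unknown policy: {policy}")

        remaining[warp] -= 1
        cycle += 1
        if remaining[warp] == 0:
            ready_at[warp] = cycle

    # Run compute phases one at a time, in the order the data became ready.
    finish = 0
    for t in sorted(ready_at):
        finish = max(finish, t) + compute_cycles
    return finish


rr_cycles = simulate(4, 8, 8, "round_robin")
pr_cycles = simulate(4, 8, 8, "priority")
print(rr_cycles, pr_cycles)  # prints "61 40"
```

In this toy setup, round-robin issue lets every warp finish loading at almost the same time, so compute can barely overlap with memory traffic. Serving one warp's requests first lets that warp start computing while the next warp loads. The numbers come from the toy model only and say nothing about the speedups reported in the paper.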

Tags: Computer science, GPGPU-sim, Hardware Architecture, nVidia, nVidia GeForce GTX 480

February 1, 2015 by hgpu
