high performance computing on graphics processing units: hgpu.org

GPUHarbor: Testing GPU Memory Consistency at Large

Reese Levine, Mingun Cho, Devon McKee, Andrew Quinn, Tyler Sorensen

UC Santa Cruz, USA

Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023

@article{levine2023gpuharbor,
  title={GPUHarbor: Testing GPU Memory Consistency at Large (Experience Paper)},
  author={Levine, Reese and Cho, Mingun and McKee, Devon and Quinn, Andrew and Sorensen, Tyler},
  year={2023}
}

Memory consistency specifications (MCSs) are a difficult, yet critical, part of a concurrent programming framework. Existing MCS testing tools are not immediately accessible, and thus they have only been applied to a limited number of platforms. However, in the post-Dennard scaling landscape, there has been an explosion of new architectures and frameworks, especially for GPUs. Studying the shared memory behaviors of different devices (across vendors and architecture generations) is important to ensure conformance and to understand the extent to which devices show different behaviors. In this paper, we present GPUHarbor, a widescale GPU MCS testing tool. GPUHarbor has two interfaces: a web interface and an Android app. Using GPUHarbor, we deployed a testing campaign that checks conformance and characterizes weak behaviors. We advertised GPUHarbor on forums and social media, allowing us to collect testing data from 106 devices, spanning seven vendors. In terms of devices tested, this constitutes the largest study on weak memory behaviors by at least 10x, and our conformance tests identified two new bugs on embedded Arm and NVIDIA devices. Analyzing our characterization data yields many insights, including quantifying and comparing weak behavior occurrence rates (e.g., AMD GPUs show 25.3x more weak behaviors on average than Intel). We conclude with a discussion of the impact our results have on software development for these performance-critical devices.
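To make "weak behavior" concrete, here is a minimal Python sketch. It is not GPUHarbor's implementation and does not run on a GPU. It takes the message-passing (MP) litmus test, a standard shape in the memory-model literature, and enumerates every outcome reachable when the two threads simply interleave in program order (sequential consistency). An outcome outside that set is what is usually called a weak behavior. The names `T0`, `T1`, `sc_outcomes`, and `is_weak` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations

# Message-passing (MP) litmus test; x and y start at 0.
# Thread 0: x = 1; y = 1        Thread 1: r0 = y; r1 = x
T0 = [("store", "x", 1), ("store", "y", 1)]
T1 = [("load", "y", "r0"), ("load", "x", "r1")]


def sc_outcomes(t0, t1):
    """Return every (r0, r1) reachable by interleaving t0 and t1 in program order."""
    outcomes = set()
    total = len(t0) + len(t1)
    # Choose which global steps belong to thread 0; the rest belong to thread 1.
    for slots in combinations(range(total), len(t0)):
        mem = {"x": 0, "y": 0}
        regs = {}
        it0, it1 = iter(t0), iter(t1)
        for step in range(total):
            op, loc, arg = next(it0) if step in slots else next(it1)
            if op == "store":
                mem[loc] = arg
            else:
                regs[arg] = mem[loc]
        outcomes.add((regs["r0"], regs["r1"]))
    return outcomes


def is_weak(outcome, allowed):
    """An observed outcome is weak if no in-order interleaving produces it."""
    return outcome not in allowed


ALLOWED = sc_outcomes(T0, T1)
```

Under these assumptions, observing `r0 == 1` and `r1 == 0` on a device would count as a weak behavior, since the reader saw the flag `y` but not the data `x`. Whether such an outcome is a conformance bug or merely a permitted relaxation depends on the MCS in question and on any synchronization the test adds.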

Tags: AMD Radeon Pro 5500M, ATI, Computer science, Memory model, nVidia, nVidia GeForce RTX 3080, nVidia Tegra TX1, OpenCL, Performance

June 11, 2023 by hgpu
