high performance computing on graphics processing units: hgpu.org

Dissecting CPU-GPU Unified Physical Memory on AMD MI300A APUs

Jacob Wahlgren, Gabin Schieffer, Ruimin Shi, Edgar A. León, Roger Pearce, Maya Gokhale, Ivy Peng

KTH Royal Institute of Technology, Sweden

arXiv:2508.12743 [cs.DC], (18 Aug 2025)

DOI:10.48550/arXiv.2508.12743

@misc{wahlgren2025dissectingcpugpuunifiedphysical,
  title={Dissecting CPU-GPU Unified Physical Memory on AMD MI300A APUs},
  author={Jacob Wahlgren and Gabin Schieffer and Ruimin Shi and Edgar A. León and Roger Pearce and Maya Gokhale and Ivy Peng},
  year={2025},
  eprint={2508.12743},
  archivePrefix={arXiv},
  primaryClass={cs.DC},
  url={https://arxiv.org/abs/2508.12743}
}

Source code package: Benchmarks for Dissecting CPU-GPU Unified Physical Memory on AMD MI300A APUs

Discrete GPUs are a cornerstone of HPC and data center systems, but they require managing separate CPU and GPU memory spaces. Unified Virtual Memory (UVM) has been proposed to ease the burden of memory management, but it comes at a high cost in performance. The recent introduction of AMD's MI300A Accelerated Processing Units (APUs), as deployed in the El Capitan supercomputer, enables HPC systems featuring an integrated CPU and GPU with Unified Physical Memory (UPM) for the first time. This work presents the first comprehensive characterization of the UPM architecture on MI300A. We first analyze the UPM system properties, including memory latency, bandwidth, and coherence overhead. We then assess the efficiency of the system software in memory allocation, page fault handling, TLB management, and Infinity Cache utilization. We propose a set of porting strategies for transforming applications for the UPM architecture and evaluate six applications on the MI300A APU. Our results show that applications on UPM using the unified memory model can match or outperform those in the explicitly managed model while reducing memory costs by up to 44%.
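One plausible reason the unified memory model can lower memory costs is that the explicitly managed model often keeps both a host copy and a device copy of the same data, whereas a unified allocation is shared by CPU and GPU. The sketch below is a toy footprint calculation under that assumption. The `Buffer` type, the buffer names, and all sizes are hypothetical and are not taken from the paper or its benchmarks.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    size_gib: float
    mirrored: bool  # assumption: True if the explicit model keeps host + device copies


def explicit_footprint(buffers):
    # Explicitly managed model: a mirrored buffer is allocated twice.
    return sum(b.size_gib * (2 if b.mirrored else 1) for b in buffers)


def unified_footprint(buffers):
    # Unified model: a single allocation is visible to both CPU and GPU.
    return sum(b.size_gib for b in buffers)


def reduction_percent(buffers):
    explicit = explicit_footprint(buffers)
    unified = unified_footprint(buffers)
    return 100.0 * (explicit - unified) / explicit


# Hypothetical application profile (illustrative numbers only).
buffers = [
    Buffer("input_grid", 4.0, True),
    Buffer("output_grid", 4.0, True),
    Buffer("gpu_scratch", 6.0, False),
    Buffer("cpu_metadata", 2.0, False),
]

print(f"explicit: {explicit_footprint(buffers):.1f} GiB")
print(f"unified:  {unified_footprint(buffers):.1f} GiB")
print(f"reduction: {reduction_percent(buffers):.1f}%")
```

In this simplified model the saving approaches 50% only when every buffer is mirrored, so the achievable reduction depends on how much of an application's data is duplicated across host and device.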

Tags: AMD Radeon Instinct MI300A, ATI, Benchmarking, Computer science, HIP, Memory model, Package

August 31, 2025 by hgpu
