
Dissecting CPU-GPU Unified Physical Memory on AMD MI300A APUs

Jacob Wahlgren, Gabin Schieffer, Ruimin Shi, Edgar A. León, Roger Pearce, Maya Gokhale, Ivy Peng
KTH Royal Institute of Technology, Sweden
arXiv:2508.12743 [cs.DC], (18 Aug 2025)

@misc{wahlgren2025dissectingcpugpuunifiedphysical,
   title={Dissecting CPU-GPU Unified Physical Memory on AMD MI300A APUs},
   author={Jacob Wahlgren and Gabin Schieffer and Ruimin Shi and Edgar A. León and Roger Pearce and Maya Gokhale and Ivy Peng},
   year={2025},
   eprint={2508.12743},
   archivePrefix={arXiv},
   primaryClass={cs.DC},
   url={https://arxiv.org/abs/2508.12743}
}

Discrete GPUs are a cornerstone of HPC and data center systems, but they require management of separate CPU and GPU memory spaces. Unified Virtual Memory (UVM) has been proposed to ease the burden of memory management, but it comes at a high cost in performance. The recent introduction of AMD’s MI300A Accelerated Processing Units (APUs), as deployed in the El Capitan supercomputer, enables HPC systems featuring integrated CPU and GPU with Unified Physical Memory (UPM) for the first time. This work presents the first comprehensive characterization of the UPM architecture on MI300A. We first analyze the UPM system properties, including memory latency, bandwidth, and coherence overhead. We then assess the efficiency of the system software in memory allocation, page fault handling, TLB management, and Infinity Cache utilization. We propose a set of porting strategies for transforming applications for the UPM architecture and evaluate six applications on the MI300A APU. Our results show that applications on UPM using the unified memory model can match or outperform those in the explicitly managed model, while reducing memory costs by up to 44%.