Enabling full-speed random access to the entire memory on the A100 GPU

Alden Walker
arXiv:2405.11425 [cs.PF] (19 May 2024)

We describe some features of the A100 memory architecture. In particular, we give a technique to reverse-engineer some hardware layout information. Using this information, we show how to avoid TLB issues to obtain full-speed random HBM access to the entire memory, as long as we constrain any particular thread to a reduced access window of less than 64GB.
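
As a rough illustration of the access constraint stated in the abstract (and not the paper's actual method), the sketch below assigns each thread a window smaller than 64 GB and draws random addresses only inside that window, while the windows together still span the whole memory. The total memory size, the window size, the access granularity, and the helper names `window_base` and `random_address` are all assumptions made for this example; the abstract specifies none of them beyond the 64 GB bound.

```python
import random

GIB = 1 << 30
TOTAL_BYTES = 80 * GIB    # assumption: an 80 GB A100 variant
WINDOW_BYTES = 32 * GIB   # assumption: any size below the 64 GB bound
ACCESS_BYTES = 32         # assumption: hypothetical access granularity


def window_base(thread_id, num_threads):
    """Return the start address of this thread's window (hypothetical policy).

    Window starts are spread evenly over [0, TOTAL_BYTES - WINDOW_BYTES]
    and rounded down to the access granularity.
    """
    if num_threads == 1:
        return 0
    span = TOTAL_BYTES - WINDOW_BYTES
    base = span * thread_id // (num_threads - 1)
    return base - base % ACCESS_BYTES


def random_address(thread_id, num_threads, rng):
    """Pick a random aligned address inside the thread's own window."""
    base = window_base(thread_id, num_threads)
    slot = rng.randrange(WINDOW_BYTES // ACCESS_BYTES)
    return base + slot * ACCESS_BYTES


if __name__ == "__main__":
    rng = random.Random(0)
    for t in range(4):
        base = window_base(t, 4)
        addr = random_address(t, 4, rng)
        # Every address stays inside the thread's window.
        assert base <= addr < base + WINDOW_BYTES
        print(f"thread {t}: window starts at {base // GIB} GiB, address {addr:#x}")
```

With four threads and these assumed sizes, the windows start at 0, 16, 32, and 48 GiB, so every byte of the assumed 80 GiB falls inside at least one window even though no single thread ranges over more than 32 GiB. How the windows must be placed on real hardware to avoid the TLB issues is the subject of the paper and is not modeled here.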
