Enabling full-speed random access to the entire memory on the A100 GPU
arXiv:2405.11425 [cs.PF], (19 May 2024)
@misc{walker2024enabling,
title={Enabling full-speed random access to the entire memory on the A100 GPU},
author={Alden Walker},
year={2024},
eprint={2405.11425},
archivePrefix={arXiv},
primaryClass={cs.PF}
}
We describe some features of the A100 memory architecture. In particular, we give a technique to reverse-engineer some hardware layout information. Using this information, we show how to avoid TLB issues to obtain full-speed random HBM access to the entire memory, as long as we constrain any particular thread to a reduced access window of less than 64GB.
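The abstract does not spell out how the access window is enforced, so the following is only a minimal sketch of the address arithmetic behind "each thread stays inside a window smaller than 64GB". It assumes an 80GB device, a 32GB window, and a 32-byte access granularity. The names `window_base` and `random_address` are hypothetical and are not taken from the paper. The sketch does not model the TLB or the reverse-engineered hardware layout; it only shows that per-thread windows can be placed so that every random address a thread generates falls inside its own window.

```python
import random

GiB = 1 << 30
TOTAL_BYTES = 80 * GiB   # assumption: 80GB A100 variant
WINDOW_BYTES = 32 * GiB  # assumption: any window size below 64GB would do
ACCESS_BYTES = 32        # assumption: granularity of one random access


def window_base(thread_id, num_threads):
    """Start offset of a thread's window (hypothetical placement scheme).

    Bases are spread evenly from 0 to TOTAL_BYTES - WINDOW_BYTES, so the
    first window starts at offset 0 and the last one ends at the top of
    memory. With enough threads, neighbouring windows overlap.
    """
    if num_threads == 1:
        return 0
    span = TOTAL_BYTES - WINDOW_BYTES
    base = thread_id * span // (num_threads - 1)
    # Round down so every window starts on an access-sized boundary.
    return base // ACCESS_BYTES * ACCESS_BYTES


def random_address(thread_id, num_threads, rng):
    """Pick a random, aligned byte offset inside this thread's window."""
    slot = rng.randrange(WINDOW_BYTES // ACCESS_BYTES)
    return window_base(thread_id, num_threads) + slot * ACCESS_BYTES


if __name__ == "__main__":
    rng = random.Random(0)
    for tid in (0, 53, 107):
        addr = random_address(tid, 108, rng)
        print(tid, window_base(tid, 108) // GiB, addr)
```

In a real kernel the same arithmetic would run per thread (or per thread block) on the device; the thread count of 108 in the usage lines is just an example value.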
May 26, 2024 by hgpu