Interconnect Bandwidth Heterogeneity on AMD MI250x and Infinity Fabric
Sandia National Laboratories, Albuquerque, NM, USA
arXiv:2302.14827 [cs.DC], (28 Feb 2023)
@misc{https://doi.org/10.48550/arxiv.2302.14827,
doi={10.48550/ARXIV.2302.14827},
url={https://arxiv.org/abs/2302.14827},
author={Pearson, Carl},
keywords={Distributed, Parallel, and Cluster Computing (cs.DC), FOS: Computer and information sciences},
title={Interconnect Bandwidth Heterogeneity on AMD MI250x and Infinity Fabric},
publisher={arXiv},
year={2023},
copyright={arXiv.org perpetual, non-exclusive license}
}
Demand for low-latency and high-bandwidth data transfer between GPUs has driven the development of multi-GPU nodes. Physical constraints on the manufacture and integration of such systems have yielded heterogeneous intra-node interconnects, where not all devices are connected equally. The next generation of supercomputing platforms is expected to feature AMD CPUs and GPUs. This work characterizes the extent to which interconnect heterogeneity is visible through GPU programming APIs on a system with four AMD MI250x GPUs, and provides several insights for users of such systems.
March 5, 2023 by hgpu