Optimizing GPU to GPU Communication on Cray XK7
NVIDIA, Santa Clara, CA, USA
A New Vintage of Computing (CUG2013), 2013
@inproceedings{larkin2013optimizing,
title={Optimizing GPU to GPU Communication on Cray XK7},
author={Larkin, Jeff M},
booktitle={A New Vintage of Computing (CUG2013)},
year={2013}
}
When developing an application for Cray XK7 systems, optimizing compute kernels is only a small part of maximizing scaling and performance. Programmers must also consider how the GPU’s distinct address space and the PCIe bus affect application scalability. Without such considerations, applications rapidly become limited by transfers to and from the GPU and fail to scale to large numbers of nodes. This paper demonstrates methods for optimizing GPU-to-GPU communication and presents XK7 results for these methods.