DMA-Assisted, Intranode Communication in GPU Accelerated Systems
Department of Computer Science, North Carolina State University
14th IEEE International Conference on High Performance Computing and Communications (HPCC), 2012
@inproceedings{ji2012dma,
title={DMA-Assisted, Intranode Communication in GPU Accelerated Systems},
author={Ji, F. and Aji, A.M. and Dinan, J. and Buntinas, D. and Balaji, P. and Thakur, R. and Feng, W. and Ma, X.},
booktitle={14th IEEE International Conference on High Performance Computing and Communications (HPCC)},
year={2012}
}
Accelerator awareness has become a pressing issue in data movement models, such as MPI, because of the rapid deployment of systems that utilize accelerators. In our previous work, we developed techniques to enhance MPI with accelerator awareness, thus allowing applications to easily and efficiently communicate data between accelerator memories. In this paper, we extend this work with techniques to perform efficient data movement between accelerators within the same node using a DMA-assisted, peer-to-peer intranode communication technique that was recently introduced for NVIDIA GPUs. We present a detailed design of our new approach to intranode communication and evaluate its improvement to communication and application performance using micro-kernel benchmarks and a 2D stencil application kernel.
June 6, 2012 by hgpu