Efficient Intranode Communication in GPU-Accelerated Systems

Feng Ji, Ashwin M. Aji, James Dinan, Darius Buntinas, Pavan Balaji, Wu-chun Feng, Xiaosong Ma
Department of Computer Science, North Carolina State University
2nd IEEE International Workshop on Accelerators and Hybrid Exascale Systems (in conjunction with the 26th IEEE International Parallel and Distributed Processing Symposium), 2012

@InProceedings{aji-intranode-comm-ashes12,
  author    = {Ji, Feng and Aji, Ashwin M. and Dinan, James and Buntinas, Darius and Balaji, Pavan and Feng, Wu-chun and Ma, Xiaosong},
  title     = {{Efficient Intranode Communication in GPU-Accelerated Systems}},
  booktitle = {2nd IEEE International Workshop on Accelerators and Hybrid Exascale Systems (in conjunction with the 26th IEEE International Parallel and Distributed Processing Symposium)},
  address   = {Shanghai, China},
  month     = {May},
  year      = {2012}
}

Current MPI implementations are unaware of accelerator memory (i.e., GPU device memory) and require programmers to explicitly move data between memory spaces. This approach is inefficient, especially for intranode communication, where it can result in several extra copy operations. In this work, we integrate GPU awareness into a popular MPI runtime system and develop techniques that significantly reduce the cost of intranode communication involving one or more GPUs. Experimental results show up to a 2x increase in bandwidth, yielding an average 4.3% improvement in the total execution time of a halo-exchange benchmark.
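To make the motivation concrete, the following is a minimal sketch (not the paper's implementation) contrasting the explicit host-staging pattern the abstract criticizes with the GPU-aware style it enables. It assumes a CUDA-capable MPI build such as MPICH or MVAPICH2 run with two ranks; buffer names and sizes are illustrative.

```c
/* Sketch: explicit host staging vs. GPU-aware MPI for a device buffer.
 * Assumes two MPI ranks on one node and a CUDA-aware MPI library;
 * names (d_buf, h_buf, N) are hypothetical, not from the paper. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdlib.h>

#define N (1 << 20)

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *d_buf;                               /* GPU device buffer */
    cudaMalloc((void **)&d_buf, N * sizeof(double));

    /* (a) GPU-unaware MPI: the programmer stages through host memory,
     * paying an extra device<->host copy on each side of the transfer. */
    double *h_buf = malloc(N * sizeof(double));
    if (rank == 0) {
        cudaMemcpy(h_buf, d_buf, N * sizeof(double), cudaMemcpyDeviceToHost);
        MPI_Send(h_buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(h_buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        cudaMemcpy(d_buf, h_buf, N * sizeof(double), cudaMemcpyHostToDevice);
    }

    /* (b) GPU-aware MPI: the device pointer is passed directly, letting
     * the runtime manage data movement internally. This is what allows the
     * kind of intranode copy elimination the paper studies. */
    if (rank == 0)
        MPI_Send(d_buf, N, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, N, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    free(h_buf);
    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```

Pattern (b) both simplifies application code and gives the runtime the freedom to use shared-memory or peer-to-peer paths within a node, which is where the bandwidth gains reported above come from.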

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: