Efficient implementation of the overlap operator on multi-GPUs

Andrei Alexandru, Michael Lujan, Craig Pelissier, Ben Gamari, Frank X. Lee
Department of Physics, The George Washington University, 725 21st St. NW, Washington, DC 20052
arXiv:1106.4964v1 [hep-lat] (24 Jun 2011)





Lattice QCD calculations were among the first applications to show the potential of GPUs in high-performance computing. Our interest is in finding ways to use GPUs effectively for lattice calculations with the overlap operator. The large memory footprint of these codes requires the use of multiple GPUs in parallel. In this paper we describe the methods we used to implement this operator efficiently. We run our codes on both a GPU cluster and a CPU cluster with similar interconnects. We find that, to match the performance of the GPU cluster, the CPU cluster requires 20-30 times more CPU cores than GPUs.
