6284

Design of MILC Lattice QCD Application for GPU Clusters

Guochun Shi, Steven Gottlieb, Aaron Torok, Volodymyr Kindratenko
National Center for Supercomputing Applications (NCSA), University of Illinois, Urbana, IL, USA
IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2011
BibTeX

Download Download (PDF)   View View   Source Source   

1878

views

We present an implementation of the improved staggered quark action lattice QCD computation designed for execution on a GPU cluster. The parallelization strategy is based on dividing the space-time lattice along the time dimension and distributing the sub-lattices among the GPU cluster nodes. We provide a mixed-precision floating-point GPU implementation of the multi-mass conjugate gradient solver. Our single GPU implementation of the conjugate gradient solver achieves a 9x performance improvement over the highly optimized code executed on a state-of-the-art eight-core CPU node. The overall application executes almost six times faster on a GPU-enabled cluster vs. a conventional multi-core cluster. The developed code is currently used for running production QCD calculations with electromagnetic corrections.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org