Scaling Lattice QCD beyond 100 GPUs
Center for Computational Science, Boston University, Boston, MA 02215, USA
arXiv:1109.2935v1 [hep-lat] (13 Sep 2011)
Over the past five years, graphics processing units (GPUs) have had a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations in nuclear and particle physics. While GPUs have been applied with great success to the post-Monte Carlo "analysis" phase, which accounts for a substantial fraction of the workload in a typical LQCD calculation, the initial Monte Carlo "gauge field generation" phase requires capability-level supercomputing, corresponding to O(100) GPUs or more. Such strong scaling has not previously been achieved. In this contribution, we demonstrate that a multi-dimensional parallelization strategy combined with a domain-decomposed preconditioner allows us to scale into this regime. We present results for two popular discretizations of the Dirac operator, Wilson-clover and improved staggered, employing up to 256 GPUs on the Edge cluster at Lawrence Livermore National Laboratory.
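To make the parallelization idea concrete, the sketch below (illustrative only: the lattice extents and process grids are example values, not figures from the paper, and this is not the authors' production code) compares slicing a 4D lattice along the time direction only against splitting it across all four dimensions over the same number of GPUs. The halo count stands in for the data each GPU must exchange with its neighbours on every application of the Dirac operator; partitioning in several dimensions keeps the surface-to-volume ratio, and hence the communication overhead, far lower at large GPU counts.

```cpp
// Minimal sketch (hypothetical example values, not code from the paper):
// communication surface of a 1-D time-sliced decomposition versus a 4-D
// decomposition of the same lattice over the same number of GPUs.
#include <cstdio>

int main() {
    const int global[4] = {32, 32, 32, 256};     // example global extents (X, Y, Z, T)
    const int grids[2][4] = {{1, 1, 1, 256},     // slice only the T dimension
                             {2, 4, 4, 8}};      // split all four dimensions
    for (int g = 0; g < 2; ++g) {
        long volume = 1;
        int local[4];
        for (int d = 0; d < 4; ++d) {
            local[d] = global[d] / grids[g][d];  // per-GPU sub-lattice extent
            volume *= local[d];
        }
        long halo = 0;                           // sites exchanged with neighbours
        for (int d = 0; d < 4; ++d) {
            if (grids[g][d] > 1)
                halo += 2L * (volume / local[d]);// two faces per partitioned dimension
        }
        std::printf("grid %dx%dx%dx%d: local volume %ld, halo sites %ld, surface/volume %.2f\n",
                    grids[g][0], grids[g][1], grids[g][2], grids[g][3],
                    volume, halo, (double)halo / volume);
    }
    return 0;
}
```

With these example numbers, each GPU holds the same 32768-site sub-lattice in both cases, but the time-sliced layout exchanges twice its local volume per neighbour update while the 4-D layout exchanges well under one local volume. The domain-decomposed preconditioner mentioned in the abstract attacks the remaining overhead from another direction: each GPU solves its local sub-domain problem without inter-GPU communication (in the spirit of an additive Schwarz method), so most of the solver's work avoids the network entirely.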
September 15, 2011 by hgpu