
Coulomb, Landau and Maximally Abelian Gauge Fixing in Lattice QCD with Multi-GPUs

Mario Schröck, Hannes Vogt
Institut für Physik, FB Theoretische Physik, Universität Graz, 8010 Graz, Austria
arXiv:1212.5221 [hep-lat] (20 Dec 2012)

@article{2012arXiv1212.5221S,
   author = {{Schr{\"o}ck}, M. and {Vogt}, H.},
   title = "{Coulomb, Landau and Maximally Abelian Gauge Fixing in Lattice QCD with Multi-GPUs}",
   journal = {ArXiv e-prints},
   archivePrefix = "arXiv",
   eprint = {1212.5221},
   primaryClass = "hep-lat",
   keywords = {High Energy Physics - Lattice},
   year = {2012},
   month = dec,
   adsurl = {http://adsabs.harvard.edu/abs/2012arXiv1212.5221S},
   adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}


A lattice gauge theory framework for simulations on graphics processing units (GPUs) using NVIDIA's CUDA is presented. The code comprises template classes that enforce an optimal data layout to ensure coalesced reads from device memory and thereby maximum performance. In this work we concentrate on applications for lattice gauge fixing in 3+1 dimensional SU(3) lattice gauge field theories. We employ the overrelaxation, stochastic relaxation and simulated annealing algorithms, which are well suited to acceleration on highly parallel architectures like GPUs. The applications support the Coulomb, Landau and maximally Abelian gauges. Moreover, we explore the evolution of the numerical accuracy of the SU(3)-valued degrees of freedom over the runtime of the algorithms in single (SP) and double precision (DP). From this we draw conclusions on the reliability of SP and DP simulations and suggest a mixed-precision scheme that performs the critical parts of the algorithm in full DP while retaining 80-90% of the SP performance. Finally, multiple GPUs are adopted to overcome the memory constraint of a single GPU. A communicator class is presented which effectively hides the MPI data exchange at the boundaries of the lattice domains, carried out over the low-bandwidth PCI bus, behind calculations in the inner part of the domain. Linear scaling using 16 NVIDIA Tesla C2070 devices and a maximum performance of 3.5 Teraflops on lattices of size down to 64^3 x 256 are demonstrated.
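The communicator class itself is not shown in this listing; the following is a minimal, hypothetical CUDA/MPI sketch of the overlap pattern the abstract describes: the interior sites are updated on one stream while the boundary layer is exchanged with the neighbouring MPI ranks on another. The kernel bodies, buffer layout and one-dimensional domain split are placeholder assumptions for illustration, not the authors' implementation (for genuine overlap the host buffers would also have to be pinned).

#include <mpi.h>
#include <cuda_runtime.h>

// Placeholder kernels: a real gauge-fixing sweep would update SU(3) links here.
__global__ void updateInterior(float* field, int nInterior)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < nInterior) field[i] *= 0.9f;      // stand-in for the relaxation update
}

__global__ void updateBoundary(float* field, const float* halo, int nBoundary)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < nBoundary) field[i] += halo[i];   // stand-in for the boundary update
}

// One sweep that hides the halo exchange (device-to-host copy, MPI transfer,
// host-to-device copy) behind the interior kernel running on a separate stream.
void relaxationSweep(float* d_field, float* d_halo,
                     float* h_send, float* h_recv,
                     int nInterior, int nBoundary, size_t haloBytes,
                     int upRank, int downRank, MPI_Comm comm,
                     cudaStream_t interior, cudaStream_t boundary)
{
    // 1. Start copying the boundary layer to the host on the boundary stream.
    cudaMemcpyAsync(h_send, d_field, haloBytes, cudaMemcpyDeviceToHost, boundary);

    // 2. Launch the interior update; it overlaps with the transfer and the MPI call.
    updateInterior<<<(nInterior + 255) / 256, 256, 0, interior>>>(d_field, nInterior);

    // 3. Exchange halos with the neighbouring ranks once the copy has finished.
    cudaStreamSynchronize(boundary);
    MPI_Sendrecv(h_send, (int)haloBytes, MPI_BYTE, upRank,   0,
                 h_recv, (int)haloBytes, MPI_BYTE, downRank, 0,
                 comm, MPI_STATUS_IGNORE);

    // 4. Copy the received halo back and update the boundary sites.
    cudaMemcpyAsync(d_halo, h_recv, haloBytes, cudaMemcpyHostToDevice, boundary);
    updateBoundary<<<(nBoundary + 255) / 256, 256, 0, boundary>>>(d_field, d_halo, nBoundary);

    // 5. Both streams must finish before the next sweep starts.
    cudaStreamSynchronize(interior);
    cudaStreamSynchronize(boundary);
}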


