Accelerating QDP++/Chroma on GPUs
School of Physics and Astronomy, University of Edinburgh, Edinburgh EH9 3JZ, UK
arXiv:1111.5596v1 [hep-lat] (23 Nov 2011)
@article{2011arXiv1111.5596W,
author={{Winter}, F.},
title={{Accelerating QDP++/Chroma on GPUs}},
journal={ArXiv e-prints},
archivePrefix={arXiv},
eprint={1111.5596},
primaryClass={hep-lat},
keywords={High Energy Physics – Lattice, Computer Science – Distributed, Parallel, and Cluster Computing},
year={2011},
month={nov},
adsurl={http://adsabs.harvard.edu/abs/2011arXiv1111.5596W},
adsnote={Provided by the SAO/NASA Astrophysics Data System}
}
Extensions to the C++ implementation of the QCD Data Parallel Interface are provided, enabling acceleration of expression evaluation on NVIDIA GPUs. Single expressions are off-loaded to the device memory and execution domain, leveraging the Portable Expression Template Engine and using Just-in-Time compilation techniques. Memory management is automated by a software implementation of a cache controlling the GPU’s memory. Interoperability with existing Krylov space solvers is demonstrated and special attention is paid to ‘Chroma readiness’. Non-kernel routines in lattice QCD calculations, which are typically not subject to hand-tuned optimisations, are accelerated, which can reduce the effects otherwise suffered from Amdahl’s Law.
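For readers unfamiliar with expression templates, the following is a minimal CPU-only sketch of the general technique the abstract refers to: an arithmetic expression over lattice fields is captured as a type at compile time and evaluated in a single loop over sites. It is not the QDP++ or PETE API, and it does not perform any JIT compilation or GPU off-loading. All names (`sketch`, `Field`, `Add`, `Mul`, `demo`) are hypothetical and chosen for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch only: hypothetical names, not the QDP++/PETE interface.
namespace sketch {

// Leaf node of an expression tree: one value per lattice site.
struct Field {
    std::vector<double> data;
    explicit Field(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }

    // Assigning an expression evaluates the whole tree in one loop over
    // sites, without creating temporary fields for sub-expressions.
    template <class Expr>
    Field& operator=(const Expr& e) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

// Interior nodes store references to their operands and compute one site
// on demand. They must not outlive the full expression they appear in.
template <class L, class R>
struct Add {
    const L& l;
    const R& r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

template <class L, class R>
struct Mul {
    const L& l;
    const R& r;
    double operator[](std::size_t i) const { return l[i] * r[i]; }
};

// The operators only build the tree; no arithmetic happens here.
// They are found by argument-dependent lookup for types in this namespace.
template <class L, class R>
Add<L, R> operator+(const L& l, const R& r) { return Add<L, R>{l, r}; }

template <class L, class R>
Mul<L, R> operator*(const L& l, const R& r) { return Mul<L, R>{l, r}; }

// Usage: evaluates r = a + b * c site by site and returns the first site.
inline double demo() {
    Field a(4, 1.0), b(4, 2.0), c(4, 3.0), r(4);
    r = a + b * c;  // right-hand side has type Add<Field, Mul<Field, Field>>
    return r[0];
}

}  // namespace sketch
```

Because the complete expression is available as a single type at the point of assignment, a framework can in principle generate one kernel per expression rather than one per operator, which is presumably what makes per-expression off-loading practical. This sketch just runs the loop on the host.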
November 24, 2011 by hgpu