An Optimized Multiple Right-Hand Side Dslash Kernel for Intel Xeon Phi

Aaron Walden
Old Dominion University
Old Dominion University, 2016


   title={An Optimized Multiple Right-Hand Side Dslash Kernel for Intel Xeon Phi},

   author={Walden, Aaron},



Download Download (PDF)   View View   Source Source   



Lattice quantum chromodynamics (LQCD) stands unique as the only computationally tractable, non-perturbative, and model-independent quantum field theory of the strong nuclear force. The computational core of LQCD is the Wilson Dslash operator, a nearest neighbor stencil operator summing matrix-vector multiplications over lattice points, whose performance is bandwidth-bound on most architectures. Reportedly, up to 90% of LQCD running time may be spent computing Dslash. In recent years, efforts have been made by researchers to optimize LQCD calculations for floating point coprocessor cards such as GPUs and Intel Xeon Phi Knights Corner (KNC), which boast powerful vector processing units. Most of these efforts in the area of Dslash have focused on single right-hand side solvers. This thesis will present two optimized Dslash kernels which simplify vectorization using multiple right-hand sides and traverse lattices using novel methods. The speedups resulting from these approaches will be explored in the context of KNC’s architecture.
No votes yet.
Please wait...

* * *

* * *

* * *

HGPU group © 2010-2022 hgpu.org

All rights belong to the respective authors

Contact us: