Mixed Precision Solver Scalable to 16000 MPI Processes for Lattice Quantum Chromodynamics Simulations on the Oakforest-PACS System

Taisuke Boku, Ken-Ichi Ishikawa, Yoshinobu Kuramashi, Lawrence Meadows
Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Ibaraki 305-8573, Japan
arXiv:1709.08785 [physics.comp-ph], (26 Sep 2017)

@article{boku2017mixed,
   title={Mixed Precision Solver Scalable to 16000 MPI Processes for Lattice Quantum Chromodynamics Simulations on the Oakforest-PACS System},
   author={Boku, Taisuke and Ishikawa, Ken-Ichi and Kuramashi, Yoshinobu and Meadows, Lawrence},
   year={2017},
   month={sep},
   eprint={1709.08785},
   archivePrefix={arXiv},
   primaryClass={physics.comp-ph}
}

Lattice Quantum Chromodynamics (Lattice QCD) is a quantum field theory formulated on a finite, discretized space-time box, which allows the dynamics of quarks and gluons to be computed numerically in order to explore the nature of the subatomic world. Solving the equation of motion of quarks (the quark solver) is the most compute-intensive part of lattice QCD simulations and is one of the legacy HPC applications. We have developed a mixed-precision quark solver for a large Intel Xeon Phi (KNL) system named "Oakforest-PACS", employing O(a)-improved Wilson quarks as the discretized equation of motion. The nested BiCGStab algorithm for the solver was implemented and optimized using mixed precision, communication-computation overlapping with MPI offloading, SIMD vectorization, and thread stealing techniques. The solver achieved 2.6 PFLOPS in the single-precision part on a 400^3 x 800 lattice using 16000 MPI processes on 8000 nodes of the system.
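The mixed-precision idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' solver: the paper uses a nested BiCGStab on the O(a)-improved Wilson operator, whereas the sketch below swaps in plain Jacobi sweeps as the single-precision inner solver and applies them to a small dense, diagonally dominant matrix. All function names (`f32`, `inner_solve_single`, `mixed_precision_solve`) and parameters are hypothetical, and single precision is emulated by rounding through `struct`. What it shows is the nesting itself: residuals and solution updates are kept in double precision, while the expensive approximate solve runs in single precision.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def matvec(A, x):
    """Double-precision matrix-vector product for a dense list-of-lists matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def inner_solve_single(A, r, sweeps=20):
    """Approximately solve A e = r with Jacobi sweeps in emulated float32.

    Assumption: Jacobi is only a stand-in for the single-precision inner
    solver; it converges here because A is diagonally dominant.
    """
    n = len(r)
    A32 = [[f32(a) for a in row] for row in A]
    r32 = [f32(v) for v in r]
    e = [0.0] * n
    for _ in range(sweeps):
        new = []
        for i in range(n):
            s = r32[i]
            for j in range(n):
                if j != i:
                    # Round after every operation to mimic float32 arithmetic.
                    s = f32(s - f32(A32[i][j] * e[j]))
            new.append(f32(s / A32[i][i]))
        e = new
    return e


def mixed_precision_solve(A, b, tol=1e-12, max_outer=50):
    """Outer double-precision defect correction around a float32 inner solve."""
    x = [0.0] * len(b)
    bnorm = max(abs(v) for v in b)
    rel = float("inf")
    for outer in range(max_outer):
        # True residual in double precision.
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        rel = max(abs(v) for v in r) / bnorm
        if rel < tol:
            return x, outer, rel
        # Correction from the low-precision solver, accumulated in double.
        e = inner_solve_single(A, r)
        x = [xi + ei for xi, ei in zip(x, e)]
    return x, max_outer, rel


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
    b = [1.0, 2.0, 3.0]
    x, outer, rel = mixed_precision_solve(A, b)
    print("solution:", x)
    print("outer iterations:", outer, "relative residual:", rel)
```

Because the residual is always recomputed in double precision, the final accuracy is set by the outer loop rather than by the single-precision inner solve, which is why most of the floating-point work can be done at the lower precision.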
