Multi-Kepler GPU vs. Multi-Intel MIC for spin systems simulations
Istituto Applicazioni Calcolo, CNR, Viale Manzoni, 30 – 00185 Rome, Italy
PRACE, 2014
@article{bernaschi2014multi,
title={Multi-Kepler GPU vs. Multi-Intel MIC for spin systems simulations},
author={Bernaschi, M. and Bisson, M. and Salvadore, F.},
year={2014}
}
We present and compare the performance of two many-core architectures, the Nvidia Kepler and the Intel MIC, both in a single system and in cluster configuration, for the simulation of spin systems. As a benchmark we consider the time required to update a single spin of the 3D Heisenberg spin glass model using the over-relaxation algorithm. We also present data for a traditional high-end multi-core architecture: the Intel Sandy Bridge. The results show that, although basically the same code can be used on the two Intel architectures, the performance of an Intel MIC changes dramatically depending on (apparently) minor details. Another issue is that obtaining reasonable scalability with the Intel Phi coprocessor (Phi is the coprocessor that implements the MIC architecture) in cluster configuration requires the so-called offload mode, which reduces the performance of the single system. As for the GPU, the Kepler architecture offers a clear advantage over the previous Fermi architecture while maintaining exactly the same source code. Scalability of the multi-GPU implementation remains very good when the CPU is used as a communication co-processor of the GPU. All source codes are provided for inspection and for double-checking the results.
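To make the benchmarked operation concrete, the following is a minimal sequential sketch of an over-relaxation update for a 3D Heisenberg spin glass. It is not the authors' GPU or MIC code. It relies on the standard form of the move, in which a spin s is reflected about its local field h, s' = 2 (s·h / h·h) h − s, which leaves both |s| and the energy unchanged. The Gaussian couplings, the dictionary-based lattice layout, the site-by-site sweep order, and all function names (`make_lattice`, `local_field`, `overrelax`, `sweep`, `energy`) are assumptions chosen for illustration only.

```python
import math
import random


def make_lattice(L, rng):
    """Random unit spins and Gaussian couplings (an assumption) on an L^3 lattice."""
    spins, couplings = {}, {}
    for x in range(L):
        for y in range(L):
            for z in range(L):
                v = [rng.gauss(0.0, 1.0) for _ in range(3)]
                norm = math.sqrt(sum(c * c for c in v))
                spins[(x, y, z)] = tuple(c / norm for c in v)
                # One coupling per forward bond along each axis.
                for axis in range(3):
                    couplings[((x, y, z), axis)] = rng.gauss(0.0, 1.0)
    return spins, couplings


def shift(site, axis, step, L):
    """Neighbour of `site` along `axis`, with periodic boundaries."""
    pos = list(site)
    pos[axis] = (pos[axis] + step) % L
    return tuple(pos)


def local_field(spins, couplings, L, site):
    """h_i = sum over the six nearest neighbours j of J_ij * s_j."""
    h = [0.0, 0.0, 0.0]
    for axis in range(3):
        fwd = shift(site, axis, 1, L)
        bwd = shift(site, axis, -1, L)
        # The backward bond is stored on the backward neighbour.
        bonds = ((fwd, couplings[(site, axis)]), (bwd, couplings[(bwd, axis)]))
        for nb, J in bonds:
            for k in range(3):
                h[k] += J * spins[nb][k]
    return h


def overrelax(s, h):
    """Reflect s about h: s' = 2 (s.h / h.h) h - s."""
    hh = sum(c * c for c in h)
    if hh == 0.0:
        return s  # no field, nothing to reflect about
    f = 2.0 * sum(a * b for a, b in zip(s, h)) / hh
    return tuple(f * hc - sc for hc, sc in zip(h, s))


def sweep(spins, couplings, L):
    """Update every spin once, site by site (sequential order is an assumption)."""
    for site in list(spins):
        spins[site] = overrelax(spins[site], local_field(spins, couplings, L, site))


def energy(spins, couplings, L):
    """E = - sum over bonds of J_ij * (s_i . s_j)."""
    e = 0.0
    for (site, axis), J in couplings.items():
        nb = shift(site, axis, 1, L)
        e -= J * sum(a * b for a, b in zip(spins[site], spins[nb]))
    return e


if __name__ == "__main__":
    L = 4
    spins, couplings = make_lattice(L, random.Random(0))
    before = energy(spins, couplings, L)
    sweep(spins, couplings, L)
    print(before, energy(spins, couplings, L))
```

Because each reflection preserves s·h, comparing the energy before and after a sweep is a simple correctness check for any implementation of this update. Parallel versions typically avoid updating neighbouring spins at the same time, since each update reads the six neighbours of the spin being changed.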
February 14, 2014 by hgpu