
Evaluation of DGEMM Implementation on Intel Xeon Phi Coprocessor

Pawel Gepner, Victor Gamayunov, David L. Fraser, Eric Houdard, Ludovic Sauge, Damien Declat, Mathieu Dubois
Intel Corporation, Pipers Way, Swindon, Wiltshire SN3 1RJ, United Kingdom
Journal of Computers, Vol. 9, No. 7, 2014

@article{gepner2014evaluation,
   title={Evaluation of DGEMM Implementation on Intel Xeon Phi Coprocessor},
   author={Gepner, Pawel and Gamayunov, Victor and Fraser, David L. and Houdard, Eric and Sauge, Ludovic and Declat, Damien and Dubois, Mathieu},
   journal={Journal of Computers},
   volume={9},
   number={7},
   year={2014}
}


In this paper we present a detailed study of implementing double-precision matrix-matrix multiplication (DGEMM) on the Intel Xeon Phi Coprocessor. We discuss a DGEMM implementation that runs "natively" on the coprocessor, minimizing communication with the host CPU. We run DGEMM across a range of matrix sizes natively, as well as using the Intel Math Kernel Library. Our optimizations are designed to maximize reuse of the on-die cache, which significantly reduces data transfer from GDDR memory. Finally, we analyze the improvement of a classic matrix multiplication implementation based on the Cauchy algorithm compared with the latest results achieved using the Intel Math Kernel Library DGEMM subroutine.
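The cache-reuse idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation and not the Intel Math Kernel Library routine. It is a plain Python illustration, under the assumption that "Cauchy algorithm" refers to the textbook row-by-column product and that cache reuse is obtained by tiling the loops. The function names `dgemm_naive` and `dgemm_blocked` and the `block` parameter are hypothetical, and Python is used only for readability, not performance.

```python
def dgemm_naive(alpha, a, b, beta, c):
    """Textbook triple loop: C <- alpha * A * B + beta * C (in place)."""
    m, k, n = len(a), len(b), len(b[0])
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = alpha * acc + beta * c[i][j]
    return c


def dgemm_blocked(alpha, a, b, beta, c, block=2):
    """Tiled variant of the same update (in place).

    The loops are split into block x block tiles so that each tile of
    A, B and C is reused several times while it is small enough to stay
    in cache. The block size here is an illustrative assumption, not a
    tuned value for any real processor.
    """
    m, k, n = len(a), len(b), len(b[0])
    # Apply beta once up front; the tiles then only accumulate.
    for i in range(m):
        for j in range(n):
            c[i][j] *= beta
    for i0 in range(0, m, block):
        for p0 in range(0, k, block):
            for j0 in range(0, n, block):
                # Multiply one tile of A by one tile of B into a tile of C.
                for i in range(i0, min(i0 + block, m)):
                    row_c = c[i]
                    for p in range(p0, min(p0 + block, k)):
                        scaled = alpha * a[i][p]
                        row_b = b[p]
                        for j in range(j0, min(j0 + block, n)):
                            row_c[j] += scaled * row_b[j]
    return c
```

Both functions compute the same result. Only the order in which elements are visited differs, and that ordering is what determines how often data must be fetched again from main memory on real hardware.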