Implementing general matrix-matrix multiplication algorithm on the Intel Xeon Phi Knights Landing Processor

Raehyun Kim
Department of Mathematical Sciences, Seoul National University
Seoul National University, 2018

@phdthesis{kim2018implementing,
   title={Implementing general matrix-matrix multiplication algorithm on the Intel Xeon Phi Knights Landing Processor},
   author={Kim, Raehyun},
   school={Seoul National University},
   year={2018}
}

This paper presents the design and implementation of a general matrix-matrix multiplication (GEMM) algorithm for the second-generation Intel Xeon Phi processor, codenamed Knights Landing (KNL). We describe several development guidelines for achieving optimal performance with the C programming language and the Advanced Vector Extensions 512 (AVX-512) instruction set. We also discuss several environment-variable issues associated with parallelization on the KNL. On a single core of the KNL, our double-precision GEMM (DGEMM) implementation achieves up to 99 percent of the DGEMM performance of Intel MKL, the current state-of-the-art library. Our parallel implementation on 68 cores of the KNL also scales well, reaching up to 93 percent of the DGEMM performance of Intel MKL.
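For readers unfamiliar with how a GEMM is usually structured for cache reuse, the following is a minimal sketch in portable C of a cache-blocked DGEMM (C += A * B, row-major). It is not the thesis's implementation: the function names `dgemm_naive` and `dgemm_blocked` and the block sizes `MC`, `KC`, and `NC` are assumptions chosen for illustration, and no AVX-512 intrinsics are used. A tuned KNL kernel would presumably replace the innermost loop with hand-written vector code and pick block sizes to match the cache hierarchy.

```c
#include <assert.h>
#include <stddef.h>

/* Block sizes are illustrative assumptions, not values from the thesis. */
enum { MC = 64, KC = 64, NC = 64 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Reference triple loop: C += A * B, all matrices row-major.
   A is m x k, B is k x n, C is m x n. */
void dgemm_naive(size_t m, size_t n, size_t k,
                 const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t i = 0; i < m; ++i)
        for (size_t p = 0; p < k; ++p) {
            double a = A[i * k + p];
            for (size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[p * n + j];
        }
}

/* Cache-blocked variant: the three outer loops walk over tiles of
   C (MC x NC), A (MC x KC) and B (KC x NC); the inner loops update
   one tile of C. The unit-stride j loop is the natural candidate
   for compiler or manual vectorization. */
void dgemm_blocked(size_t m, size_t n, size_t k,
                   const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t jc = 0; jc < n; jc += NC) {
        size_t nb = min_sz(NC, n - jc);
        for (size_t pc = 0; pc < k; pc += KC) {
            size_t kb = min_sz(KC, k - pc);
            for (size_t ic = 0; ic < m; ic += MC) {
                size_t mb = min_sz(MC, m - ic);
                for (size_t i = 0; i < mb; ++i)
                    for (size_t p = 0; p < kb; ++p) {
                        double a = A[(ic + i) * k + (pc + p)];
                        const double *b = &B[(pc + p) * n + jc];
                        double *c = &C[(ic + i) * n + jc];
                        for (size_t j = 0; j < nb; ++j)
                            c[j] += a * b[j];
                    }
            }
        }
    }
}
```

Both functions accumulate into `C`, so the caller should zero `C` first when a plain product is wanted. Comparing `dgemm_blocked` against `dgemm_naive` on the same inputs is a simple correctness check before any vectorization or threading is attempted.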
* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
