SLATE port to AMD and Intel platforms
Innovative Computing Laboratory
SLATE Working Notes, no. 16, ICL-UT-21-01, 2021
@article{abdelfattah2021slate,
title={{SLATE} port to {AMD} and {Intel} platforms},
author={Abdelfattah, Ahmad and Al Farhan, Mohammed and Brown, Cade and Gates, Mark},
year={2021},
publisher={Innovative Computing Laboratory, University of Tennessee}
}
SLATE implements GPU-accelerated linear algebra, relying primarily on vendor-provided GPU BLAS for performance, in particular batched BLAS routines. Initially, SLATE was written using NVIDIA’s CUDA and cuBLAS for GPU acceleration. At the time that the SLATE project was started, it was unclear what GPU technologies would exist for other platforms [1]. Since then, AMD has developed their ROCm platform, which includes the HIP portability layer, and Intel has developed their oneAPI platform based on SYCL, a C++ descendant of OpenCL. These software stacks will be used on upcoming exascale systems: Frontier, built by AMD for Oak Ridge National Laboratory, and Aurora, built by Intel for Argonne National Laboratory. Therefore, to support these new platforms, SLATE needed to be ported to both ROCm and Intel oneMKL.
April 18, 2021 by hgpu