Providing performance portable numerics for Intel GPUs
Steinbuch Centre for Computing, Karlsruhe Institute of Technology, Karlsruhe, Baden-Württemberg, Germany
Concurrency and Computation: Practice and Experience published by John Wiley & Sons Ltd, e7400, 2022
DOI:10.1002/cpe.7400
@article{tsai2022providing,
title={Providing performance portable numerics for Intel GPUs},
author={Tsai, Yu-Hsiang M and Cojean, Terry and Anzt, Hartwig},
journal={Concurrency and Computation: Practice and Experience},
pages={e7400},
year={2022},
publisher={Wiley Online Library}
}
With discrete Intel GPUs entering the high-performance computing landscape, there is an urgent need for production-ready software stacks for these platforms. In this article, we report how we enable the Ginkgo math library to execute on Intel GPUs by developing a kernel backend based on the DPC++ programming environment. We discuss conceptual differences between the CUDA and DPC++ programming models and describe workflows for simplified code conversion. We evaluate the performance of basic and advanced sparse linear algebra routines available in Ginkgo’s DPC++ backend relative to the hardware-specific performance bounds, and compare them against routines providing the same functionality that ship with Intel’s oneMKL vendor library.
October 30, 2022 by hgpu