Providing performance portable numerics for Intel GPUs
Steinbuch Centre for Computing, Karlsruhe Institute of Technology, Karlsruhe, Baden-Württemberg, Germany
Concurrency and Computation: Practice and Experience published by John Wiley & Sons Ltd, e7400, 2022
DOI:10.1002/cpe.7400
@article{tsai2022providing,
title={Providing performance portable numerics for Intel GPUs},
author={Tsai, Yu-Hsiang M and Cojean, Terry and Anzt, Hartwig},
journal={Concurrency and Computation: Practice and Experience},
pages={e7400},
year={2022},
publisher={Wiley Online Library}
}
With discrete Intel GPUs entering the high-performance computing landscape, there is an urgent need for production-ready software stacks for these platforms. In this article, we report how we enable the Ginkgo math library to execute on Intel GPUs by developing a kernel backend based on the DPC++ programming environment. We discuss conceptual differences between the CUDA and DPC++ programming models and describe workflows for simplified code conversion. We evaluate the performance of basic and advanced sparse linear algebra routines available in Ginkgo’s DPC++ backend relative to hardware-specific performance bounds, and compare them against routines with the same functionality that ship with Intel’s oneMKL vendor library.
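For readers unfamiliar with the kind of routine being evaluated, the following is a minimal sketch of one basic sparse linear algebra operation, a sparse matrix-vector product in compressed sparse row (CSR) format, written as plain sequential C++. The choice of this operation and format is an assumption made for illustration; the abstract does not name specific routines. The `CsrMatrix` struct and `spmv_csr` function are hypothetical names and are not Ginkgo's or oneMKL's API. A GPU backend would typically distribute the outer row loop across work-items.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration only; not Ginkgo's matrix class.
struct CsrMatrix {
    std::size_t num_rows;
    std::vector<std::size_t> row_ptrs;  // num_rows + 1 entries: start of each row
    std::vector<std::size_t> col_idxs;  // column index of each stored value
    std::vector<double> values;         // the stored (nonzero) values
};

// Computes y = A * x for a CSR matrix A (sequential reference version).
std::vector<double> spmv_csr(const CsrMatrix& a, const std::vector<double>& x)
{
    std::vector<double> y(a.num_rows, 0.0);
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        double sum = 0.0;
        // Walk only the stored entries of this row.
        for (std::size_t k = a.row_ptrs[row]; k < a.row_ptrs[row + 1]; ++k) {
            sum += a.values[k] * x[a.col_idxs[k]];
        }
        y[row] = sum;
    }
    return y;
}
```

Each stored value is read once and used in a single multiply-add, so routines of this kind are usually limited by memory bandwidth rather than arithmetic throughput, which is why comparing against hardware-specific bounds is informative.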
October 30, 2022 by hgpu