Performance portability analysis of SYCL with a classical CG on CPU, GPU, and FPGA
Institute of Parallel and Distributed Systems, University of Stuttgart, Universitätsstraße 38, D–70569 Stuttgart
University of Stuttgart, 2023
@mastersthesis{franquinet2023performance,
title={Performance portability analysis of SYCL with a classical CG on CPU, GPU, and FPGA},
author={Franquinet, Julian},
school={University of Stuttgart},
year={2023}
}
This work investigates the capability of SYCL™ to execute code on different hardware devices by means of a performance portability analysis. The architectures investigated are the CPU, GPU, and FPGA. The conjugate gradient (CG) algorithm serves as the benchmark, as it is applicable to many fields and is more complex than a simple matrix-vector multiplication. Reference results on the different devices are generated with OpenMP and CUDA. The CG is also implemented using highly optimized libraries based on the BLAS standard. On the GPU, these libraries yield a significant performance increase for growing problem sizes. On the CPU, the optimizations are more significant for smaller problem sizes. Optimized libraries for the FPGA do not exist so far and are therefore not investigated. As a result, the FPGA performs worse than the CPU and GPU, which is why the analysis yields rather low performance portability. However, the results show that SYCL™ is capable of executing code on various hardware devices, making it a promising standard for future applications.
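For readers unfamiliar with the benchmark, the following is a minimal sketch of a classical CG solver for a symmetric positive definite system A x = b. It is not the thesis implementation: it is plain Python rather than SYCL, OpenMP, or CUDA, and the function names (`dot`, `matvec`, `conjugate_gradient`) and the default tolerance are assumptions chosen for illustration. It only shows the building blocks of the algorithm (a matrix-vector product, dot products, and vector updates), which are the kind of operations BLAS-style libraries provide.

```python
def dot(x, y):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (illustrative sketch)."""
    x = [0.0] * len(b)       # initial guess x = 0
    r = list(b)              # residual r = b - A x, which equals b for x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old       # weight for the next search direction
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small example system; the exact solution is x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

Each iteration is dominated by one matrix-vector product plus a handful of vector operations, which is why a device-specific implementation of these kernels largely determines the overall performance.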
October 22, 2023 by hgpu