OpenCL Acceleration for TensorFlow
Codeplay Software Ltd.
SysML Conference, 2018
@article{goli2018opencl,
title={OpenCL Acceleration for TensorFlow},
author={Goli, Mehdi and Iwanski, Luke and Lawson, John and Dolinsky, Uwe and Richards, Andrew},
year={2018}
}
There is huge demand for targeting complex, large-scale machine learning applications, particularly those based on popular, actively maintained frameworks such as TensorFlow and CAFFE, to a variety of platforms with accelerators ranging from high-end desktop GPUs to resource-constrained embedded or mobile GPUs, FPGAs, and DSPs. However, to deliver good performance, different platforms may require different algorithms or data structures, yet code should be easily portable and reused as much as possible across different devices. The open SYCL standard addresses this by providing parallel processing through a single-source programming model, enabling the same standard C++ code to be used on the CPU and the accelerator. This allows high-level C++ abstractions and templates to be used to quickly configure device and host code to cover specific features of the platform. By targeting OpenCL, SYCL enables C++ applications such as TensorFlow to run efficiently on OpenCL devices without requiring developers to write OpenCL code.
March 3, 2018 by hgpu