https://hgpu.org/?p=26041
Optimization of Compiler-generated OpenCL CNN Kernels and Runtime for FPGAs