https://hgpu.org/?p=1066
CUDA-Lite: Reducing GPU programming complexity