https://hgpu.org/?p=10641
Compiler Optimizations for SIMD/GPU/Multicore Architectures