https://hgpu.org/?p=12821
Parallel Primitive Optimization for GPU and Multicore