https://hgpu.org/?p=28320
Implementation Techniques for SPMD Kernels on CPUs