https://hgpu.org/?p=2911
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs