https://hgpu.org/?p=5249
Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations