https://hgpu.org/?p=2994
A Domain-Specific Approach to Heterogeneous Parallelism