https://hgpu.org/?p=12666
Ocelot/HyPE: Optimized Data Processing on Heterogeneous Hardware