https://hgpu.org/?p=11924
Toward optimised skeletons for heterogeneous parallel architecture with performance cost model