https://hgpu.org/?p=11283
Improvement of the fused CUDA kernels performance prediction