https://hgpu.org/?p=1647
Designing and optimizing compute kernels on NVIDIA GPUs