https://hgpu.org/?p=15101
A CUDA Kernel Scheduler Exploiting Static Data Dependencies