https://hgpu.org/?p=3502
Optimize or Wait? Using llc Fast-Prototyping Tool to Evaluate CUDA Optimizations