https://hgpu.org/?p=1119
Optimal loop unrolling for GPGPU programs