https://hgpu.org/?p=6241
Improving GPU Performance via Large Warps and Two-Level Warp Scheduling