https://hgpu.org/?p=9297
Split tiling for GPUs: automatic parallelization using trapezoidal tiles