https://hgpu.org/?p=6888
Generating, Optimizing, and Scheduling a Compiler Level Representation of Stream Parallelism