https://hgpu.org/?p=27446
Enabling Data Movement and Computation Pipelining in Deep Learning Compiler