https://hgpu.org/?p=28823
Compiler-centric across-stack deep learning acceleration