https://hgpu.org/?p=6366
Building-Blocks for Performance Oriented DSLs