https://hgpu.org/?p=3212
A framework for efficient and scalable execution of domain-specific templates on GPUs