https://hgpu.org/?p=8662
Productive High Performance Parallel Programming with Auto-tuned Domain-Specific Embedded Languages