https://hgpu.org/?p=26992
High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs