https://hgpu.org/?p=5734
Translating GPU binaries to tiered SIMD architectures with Ocelot