https://hgpu.org/?p=17538
Data Layout Oriented Compilation Techniques in Vectorization for Multi-/Many-cores