https://hgpu.org/?p=10504
Memory transfer optimization for a lattice Boltzmann solver on Kepler architecture nVidia GPUs