https://hgpu.org/?p=6495
Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results