
On the accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit and novel 16-bit number formats

Moritz Lehmann, Mathias J. Krause, Giorgio Amati, Marcello Sega, Jens Harting, Stephan Gekle
Biofluid Simulation and Modeling – Theoretische Physik VI, University of Bayreuth
arXiv:2112.08926 [physics.comp-ph], (16 Dec 2021)

@misc{lehmann2021accuracy,
   title={On the accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit and novel 16-bit number formats},
   author={Moritz Lehmann and Mathias J. Krause and Giorgio Amati and Marcello Sega and Jens Harting and Stephan Gekle},
   year={2021},
   eprint={2112.08926},
   archivePrefix={arXiv},
   primaryClass={physics.comp-ph}
}



Fluid dynamics simulations with the lattice Boltzmann method (LBM) are very memory-intensive. Besides reducing the memory footprint, using FP32 (single) precision instead of FP64 (double) precision can bring significant performance benefits, especially on GPUs. Here, we evaluate the possibility of using even FP16 and Posit16 (half) precision for storing fluid populations, while still carrying out arithmetic operations in FP32. For this, we first show that the number range that commonly occurs in the LBM is much smaller than the FP16 number range. Based on this observation, we develop novel 16-bit formats (based on a modified IEEE-754 standard and on a modified Posit standard) that are specifically tailored to the needs of the LBM. We then carry out an in-depth characterization of LBM accuracy for six test systems of increasing complexity: Poiseuille flow, Taylor-Green vortices, Kármán vortex streets, lid-driven cavity, a microcapsule in shear flow (using the immersed-boundary method) and finally the impact of a raindrop (based on a Volume-of-Fluid approach). We find that the difference in accuracy between FP64 and FP32 is negligible in almost all cases, and that in a large number of cases even 16-bit is sufficient. Finally, we provide a detailed performance analysis of all precision levels on a large number of hardware microarchitectures and show that a significant speedup is achieved with mixed FP32/16-bit.
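The idea of storing values in 16 bits while doing arithmetic in a wider format can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names, the power-of-two scale factor `SCALE` and the test value range are all assumptions chosen for illustration, and Python's native FP64 floats stand in for the FP32 arithmetic described in the abstract. The sketch only shows that when stored values occupy a narrow range, rescaling them before conversion to standard IEEE-754 half precision can reduce the round-trip error.

```python
import random
import struct

# Hypothetical power-of-two scale factor, chosen for illustration only.
SCALE = 2.0 ** 15


def store_fp16(x: float, scale: float = 1.0) -> bytes:
    """Pack a value into 2 bytes of IEEE-754 half precision, optionally pre-scaled."""
    return struct.pack("<e", x * scale)


def load_fp16(raw: bytes, scale: float = 1.0) -> float:
    """Unpack 2 bytes of half precision back into a wider float for arithmetic."""
    return struct.unpack("<e", raw)[0] / scale


def max_roundtrip_error(values, scale: float = 1.0) -> float:
    """Largest absolute error after one store/load cycle through 16-bit storage."""
    return max(abs(load_fp16(store_fp16(v, scale), scale) - v) for v in values)


# Assumed test data: small values in a narrow range around zero.
random.seed(0)
values = [random.uniform(-1e-6, 1e-6) for _ in range(10_000)]

err_plain = max_roundtrip_error(values)
err_scaled = max_roundtrip_error(values, SCALE)

print(f"plain FP16 storage:  max abs error {err_plain:.3e}")
print(f"scaled FP16 storage: max abs error {err_scaled:.3e}")
```

Because the scale is a power of two, multiplying and dividing by it changes only the exponent, so the rescaling itself introduces no rounding error here; all of the error comes from the conversion to 16 bits.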

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
