Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration

Matteo Croci, Garth N. Wells
BCAM, Basque Center for Applied Mathematics, Bilbao, Spain & Ikerbasque, Basque Foundation for Science, Bilbao, Spain
arXiv:2410.12614 [math.NA], (16 Oct 2024)

@misc{croci2024mixedprecisionfiniteelementkernels,
   title={Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration},
   author={M. Croci and G. N. Wells},
   year={2024},
   eprint={2410.12614},
   archivePrefix={arXiv},
   primaryClass={math.NA},
   url={https://arxiv.org/abs/2410.12614}
}

In this paper we develop the first fine-grained rounding error analysis of finite element (FE) cell kernels and assembly. The theory covers mixed-precision implementations and accounts for hardware acceleration via matrix multiplication units, thus providing theoretical guidance for designing reduced- and mixed-precision FE algorithms on CPUs and GPUs. Guided by this analysis, we introduce hardware-accelerated mixed-precision implementation strategies that are provably robust to low-precision computations. Specifically, these algorithms are accurate to the lower-precision unit roundoff, with an error constant that is independent of the conditioning of FE basis function evaluations, the ill-posedness of the cell, the polynomial degree, and the number of quadrature nodes. Building on these strategies, we present the first AMX-accelerated FE kernel implementations on Intel Sapphire Rapids CPUs. Numerical experiments demonstrate that the proposed mixed- (single/half-) precision algorithms are up to 60 times faster than their double-precision equivalents while being orders of magnitude more accurate than their fully half-precision counterparts.
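
The gap between mixed- and fully half-precision accuracy described above can be illustrated on a much simpler problem than an FE kernel. The sketch below is not the authors' algorithm and does not use AMX; it is a plain NumPy dot product, assuming only that inputs are stored in half precision and that the running sum is kept either in half or in single precision. The function names `dot_half` and `dot_mixed`, the vector length, and the random test data are all illustrative assumptions.

```python
import numpy as np


def dot_half(a, b):
    """Dot product with float16 inputs, float16 products and a float16 running sum."""
    acc = np.float16(0.0)
    for x, y in zip(a.astype(np.float16), b.astype(np.float16)):
        # Every product and every partial sum is rounded to float16.
        acc = np.float16(acc + x * y)
    return float(acc)


def dot_mixed(a, b):
    """Dot product with float16 inputs, but float32 products and running sum."""
    # Inputs are first rounded to float16 (the storage precision),
    # then promoted so that all arithmetic happens in float32.
    a32 = a.astype(np.float16).astype(np.float32)
    b32 = b.astype(np.float16).astype(np.float32)
    acc = np.float32(0.0)
    for x, y in zip(a32, b32):
        acc = np.float32(acc + x * y)
    return float(acc)


# Illustrative data: positive entries, so the sum grows steadily with n.
rng = np.random.default_rng(0)
n = 16384
a = rng.uniform(0.0, 1.0, n)
b = rng.uniform(0.0, 1.0, n)

exact = float(np.dot(a, b))  # float64 reference value
err_half = abs(dot_half(a, b) - exact) / exact
err_mixed = abs(dot_mixed(a, b) - exact) / exact

print(f"relative error, fully half precision: {err_half:.3e}")
print(f"relative error, mixed half/single:    {err_mixed:.3e}")
```

In this toy setting the half-precision running sum stagnates once it grows large enough that each new product falls below half the float16 spacing, whereas the mixed variant is limited mainly by the rounding of the inputs to half precision. Whether the same behaviour carries over to a given FE kernel is what the paper's analysis addresses; this example only motivates the distinction.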
