GPU-accelerated generation of correctly-rounded elementary functions

Pierre Fortin, Mourad Gouicem, Stef Graillat
UPMC Univ Paris 06 and CNRS UMR 7606, LIP6
UPMC University, 2013

The IEEE 754-2008 standard recommends correct rounding of elementary functions. This requires solving the Table Maker's Dilemma, which demands a huge amount of CPU computation time. In this paper we consider accelerating such computations, namely Lefèvre's algorithm, on Graphics Processing Units (GPUs), which are massively parallel architectures with partial SIMD (Single Instruction, Multiple Data) execution. We first propose an analysis of Lefèvre's hard-to-round argument search using the concept of continued fractions. We then propose a new parallel search algorithm that is much more efficient on GPUs thanks to its more regular control flow. We also present an efficient hybrid CPU-GPU deployment of the generation of polynomial approximations required in Lefèvre's algorithm. Overall, we obtain speedups of up to 53.4x on one GPU over a sequential CPU execution, and up to 7.1x over a multi-core CPU.
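
For readers unfamiliar with the continued fractions mentioned in the abstract, the following is a minimal Python sketch of how the expansion of a rational number and its convergents can be computed with Euclid's algorithm. It is only an illustration of the concept, not the paper's search algorithm or its GPU implementation; the function names `continued_fraction` and `convergents` are hypothetical and chosen here for clarity.

```python
from fractions import Fraction


def continued_fraction(x):
    """Return the partial quotients of a non-negative rational x.

    Illustrative helper (not from the paper): this is Euclid's algorithm
    applied to the numerator and denominator of x.
    """
    quotients = []
    num, den = x.numerator, x.denominator
    while den != 0:
        q, r = divmod(num, den)
        quotients.append(q)
        num, den = den, r
    return quotients


def convergents(quotients):
    """Return the successive convergents p_k / q_k of a continued fraction.

    Uses the standard recurrence
        p_k = a_k * p_{k-1} + p_{k-2},  q_k = a_k * q_{k-1} + q_{k-2}.
    """
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = quotients[0], 1         # p_0, q_0
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result


if __name__ == "__main__":
    cf = continued_fraction(Fraction(415, 93))
    print(cf)
    print(convergents(cf))
```

Exact integer arithmetic is used throughout so that no rounding error enters the expansion; the last convergent always equals the input rational.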