GPU-accelerated generation of correctly-rounded elementary functions
UPMC Univ Paris 06 and CNRS UMR 7606, LIP6
UPMC University, 2013
@article{fortin2013gpu,
title={{GPU}-accelerated generation of correctly-rounded elementary functions},
author={Fortin, Pierre and Gouicem, Mourad and Graillat, Stef},
year={2013}
}
The IEEE 754-2008 standard recommends the correct rounding of elementary functions. This requires solving the Table Maker’s Dilemma, which takes a huge amount of CPU computation time. In this paper, we consider accelerating such computations, namely Lefèvre’s algorithm, on Graphics Processing Units (GPUs), which are massively parallel architectures with partial SIMD (Single Instruction, Multiple Data) execution. We first analyze Lefèvre’s hard-to-round argument search using the concept of continued fractions. We then propose a new parallel search algorithm that is much more efficient on GPUs thanks to its more regular control flow. We also present an efficient hybrid CPU-GPU deployment of the generation of the polynomial approximations required by Lefèvre’s algorithm. Overall, we obtain speedups of up to 53.4x on one GPU over a sequential CPU execution, and of up to 7.1x over a multi-core CPU.
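To make the link between the hard-to-round argument search and continued fractions more concrete, here is a minimal Python sketch. It is not the algorithm from the paper, and it is not GPU code. It assumes the usual formulation in which the function is replaced, on a small interval, by a degree-1 polynomial b - a*x, and the search looks for integers x where the fractional part of b - a*x is very small. The function names `continued_fraction` and `min_distance_bruteforce` are hypothetical, and the brute-force loop only serves as a reference for what a faster, Euclid-like search would compute.

```python
from fractions import Fraction
from math import floor


def continued_fraction(x, max_terms=20):
    """Return the partial quotients of the continued fraction of a rational x."""
    terms = []
    for _ in range(max_terms):
        q = floor(x)          # integer part is the next partial quotient
        terms.append(q)
        frac = x - q          # remaining fractional part, in [0, 1)
        if frac == 0:
            break             # expansion of a rational terminates
        x = 1 / frac          # continue with the reciprocal
    return terms


def min_distance_bruteforce(a, b, n):
    """Return (d, x) where d is the smallest value of (b - a*x) mod 1
    over integers 0 <= x < n, and x is the first argument reaching it.

    Assumption: a and b are exact rationals (Fraction), so no rounding
    error enters the comparison. Cost is O(n); this is only a reference.
    """
    best = None
    for x in range(n):
        d = (b - a * x) % 1   # fractional part, always in [0, 1)
        if best is None or d < best[0]:
            best = (d, x)
    return best


if __name__ == "__main__":
    a = Fraction(3, 8)
    b = Fraction(1, 2)
    print(continued_fraction(Fraction(415, 93)))
    print(min_distance_bruteforce(a, b, 4))
```

The exact rational arithmetic is a deliberate choice for the sketch: it keeps the comparison of fractional parts free of floating-point error, at the price of speed. A practical search avoids visiting every x, which is where the continued fraction expansion of a comes in.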
June 2, 2013 by hgpu