Simultaneous floating-point sine and cosine for VLIW integer processors
INRIA Laboratoire LIP (CNRS, ENSL, INRIA, UCBL), Université de Lyon, France
hal-00672327, 2012
@article{jeannerod2012simultaneous,
title={Simultaneous floating-point sine and cosine for VLIW integer processors},
author={Jeannerod, C.P. and Jourdan-Lu, J. and others},
year={2012}
}
Graphics and signal processing applications often require that sines and cosines be evaluated at the same floating-point argument, and in such cases a very fast computation of the pair of values is desirable. This paper studies how 32-bit VLIW integer architectures can be exploited to perform this task accurately for IEEE single precision. We describe software implementations of sinf, cosf, and sincosf over [-π/4, π/4] that have a proven 1-ulp accuracy and whose latencies on STMicroelectronics’ ST231 VLIW integer processor are 19, 18, and 19 cycles, respectively. This performance is obtained by introducing a novel algorithm for simultaneous sine and cosine that combines univariate and bivariate polynomial evaluation schemes.
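To make the idea of computing both values at one argument concrete, here is a minimal C sketch. It is not the paper's algorithm: the paper targets integer arithmetic on the ST231 and combines univariate and bivariate polynomial evaluation with a proven 1-ulp bound, whereas this sketch assumes ordinary float arithmetic, truncated Taylor coefficients, and plain Horner evaluation, and carries no accuracy proof. The function name `sincos_sketch` is hypothetical. The only point it illustrates is that the two polynomials share the square of the argument and are independent of each other, so their operations could be scheduled in parallel.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical illustration (not the paper's method): evaluates sin(x) and
   cos(x) together for |x| <= pi/4 with truncated Taylor polynomials.
   Assumption: float arithmetic and Taylor coefficients stand in for the
   integer arithmetic and tuned polynomials a real implementation would use. */
void sincos_sketch(float x, float *s, float *c)
{
    /* Precondition: argument already reduced to [-pi/4, pi/4]. */
    assert(fabsf(x) <= 0.7853982f);

    float x2 = x * x; /* shared by both polynomials */

    /* sin(x) ~ x * (1 - x^2/3! + x^4/5! - x^6/7! + x^8/9!), Horner in x2 */
    float ps = 1.0f / 362880.0f;
    ps = ps * x2 - 1.0f / 5040.0f;
    ps = ps * x2 + 1.0f / 120.0f;
    ps = ps * x2 - 1.0f / 6.0f;
    ps = ps * x2 + 1.0f;

    /* cos(x) ~ 1 - x^2/2! + x^4/4! - x^6/6! + x^8/8!, Horner in x2 */
    float pc = 1.0f / 40320.0f;
    pc = pc * x2 - 1.0f / 720.0f;
    pc = pc * x2 + 1.0f / 24.0f;
    pc = pc * x2 - 0.5f;
    pc = pc * x2 + 1.0f;

    *s = x * ps;
    *c = pc;
}
```

A caller would reduce the argument to [-π/4, π/4] first and then invoke `sincos_sketch(x, &s, &c)`. The two Horner chains have no data dependence on each other beyond `x2`, which is the kind of instruction-level parallelism a VLIW processor can exploit.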
February 27, 2012 by hgpu