Extended-precision floating-point numbers for GPU computation

Andrew Thall
Alma College
In SIGGRAPH ’06: ACM SIGGRAPH 2006 Research posters (2006)











Double-float (df64) and quad-float (qf128) numeric types can be implemented on current GPU hardware and used efficiently and effectively for extended-precision computational arithmetic. Using unevaluated sums of paired or quadrupled f32 single-precision values, these numeric types provide approximately 48 and 96 bits of mantissa respectively, at single-precision exponent ranges, for computer graphics, numerical, and general-purpose GPU programming. This paper surveys current art, presents algorithms and Cg implementations for arithmetic, exponential, and trigonometric functions, and presents data on numerical accuracy on several different GPUs. It concludes with an in-depth discussion of the application of extended-precision primitives to performing fast Fourier transforms on the GPU for real and complex data.

[Addendum (July 2009): the presence of IEEE-compliant double-precision hardware in modern GPUs from NVIDIA and other manufacturers has reduced the need for these techniques. The double-precision capabilities can be accessed using CUDA or other GPGPU software, but are not (as of this writing) exposed in the graphics pipeline for use in Cg-based shader code. Shader writers, or those still using a graphics API for their numerical computing, may still find the methods described herein to be of interest.]
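To make the "unevaluated sum of paired f32 values" idea concrete, here is a minimal CPU-side sketch in Python. It is not the paper's Cg code. It assumes the standard error-free transformations from the extended-precision literature (Knuth's two-sum, Dekker's split and two-product) and emulates f32 rounding by packing every intermediate result through `struct`. The names `f32`, `two_sum`, `quick_two_sum`, `split`, `two_prod`, and `df64_add` are illustrative choices, and a df64 value is represented here as a plain `(hi, lo)` tuple.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Error-free addition: s = fl(a + b) and s + e == a + b exactly."""
    s = f32(a + b)
    v = f32(s - a)
    e = f32(f32(a - f32(s - v)) + f32(b - v))
    return s, e


def quick_two_sum(a, b):
    """Same as two_sum, but cheaper; assumes |a| >= |b|."""
    s = f32(a + b)
    e = f32(b - f32(s - a))
    return s, e


def split(a):
    """Dekker split of an f32 into two halves with at most 12 bits each."""
    t = f32(4097.0 * a)          # 4097 = 2**12 + 1 for a 24-bit significand
    hi = f32(t - f32(t - a))
    lo = f32(a - hi)
    return hi, lo


def two_prod(a, b):
    """Error-free product: p = fl(a * b) and p + e == a * b (no underflow)."""
    p = f32(a * b)
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = f32(f32(a_hi * b_hi) - p)
    e = f32(e + f32(a_hi * b_lo))
    e = f32(e + f32(a_lo * b_hi))
    e = f32(e + f32(a_lo * b_lo))
    return p, e


def df64_add(x, y):
    """Add two double-floats given as (hi, lo) tuples of f32 values."""
    s, e = two_sum(x[0], y[0])   # high words, with rounding error e
    t, f = two_sum(x[1], y[1])   # low words, with rounding error f
    e = f32(e + t)
    s, e = quick_two_sum(s, e)   # renormalize
    e = f32(e + f)
    return quick_two_sum(s, e)   # renormalize again


# 1 + 2**-30 does not fit in one f32, but the (hi, lo) pair holds it:
# the low word keeps the 2**-30 that a single f32 addition would drop.
print(df64_add((1.0, 0.0), (2.0 ** -30, 0.0)))
```

On a GPU the same operations would run directly on native f32 values. The explicit `f32(...)` rounding here exists only because Python floats are binary64, and it also illustrates why these algorithms depend on the hardware rounding each f32 operation correctly.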
