high performance computing on graphics processing units: hgpu.org


Wanted: Floating-Point Add Round-off Error instruction

Marat Dukhan, Richard Vuduc, Jason Riedy

School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology, Atlanta, GA

arXiv:1603.00491 [cs.NA], (1 Mar 2016)

@article{dukhan2016wanted,
  title={Wanted: Floating-Point Add Round-off Error instruction},
  author={Dukhan, Marat and Vuduc, Richard and Riedy, Jason},
  year={2016},
  month={mar},
  archivePrefix={"arXiv"},
  primaryClass={cs.NA}
}


Source code package: FPplus: Scientific library for high-precision computations and research


We propose a new instruction (FPADDRE) that computes the round-off error in floating-point addition. We explain how this instruction benefits high-precision arithmetic operations in applications where double precision is not sufficient. Performance estimates on Intel Haswell, Intel Skylake, and AMD Steamroller processors, as well as the Intel Knights Corner co-processor, demonstrate that such an instruction would improve the latency of double-double addition by up to 55% and increase double-double addition throughput by up to 103%, with smaller but non-negligible benefits for double-double multiplication. The new instruction delivers up to 2x speedups on three benchmarks that use high-precision floating-point arithmetic: double-double matrix-matrix multiplication, compensated dot product, and polynomial evaluation via the compensated Horner scheme.
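To make "round-off error in floating-point addition" concrete, here is a minimal sketch of the classic software approach (Knuth's TwoSum), which recovers that error with several dependent floating-point operations. It is an illustration only, not code from the paper or from FPplus: the names `two_sum` and `compensated_sum` are hypothetical, and the assumption that FPADDRE would return the `e` term directly is inferred from the abstract rather than from the instruction's actual definition.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) where s is the rounded sum a + b
    and e is the round-off error, so that s + e equals a + b exactly
    (barring overflow)."""
    s = a + b                  # rounded sum
    b_virtual = s - a          # the part of b that actually made it into s
    a_virtual = s - b_virtual  # the part of a that actually made it into s
    # Assumption: this error term is what a hardware FPADDRE-style
    # instruction would produce in a single operation.
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def compensated_sum(values):
    """Sum the values while accumulating the round-off error of every
    addition, then fold the accumulated error back in at the end."""
    total = 0.0
    error = 0.0
    for v in values:
        total, e = two_sum(total, v)
        error += e
    return total + error


if __name__ == "__main__":
    s, e = two_sum(1.0, 1e-16)
    print(s, e)
    print(sum([1e16, 1.0, -1e16]), compensated_sum([1e16, 1.0, -1e16]))
```

In this software form, the error term costs five extra floating-point operations after the initial addition, which is the overhead that a dedicated instruction would aim to remove in double-double addition and compensated algorithms.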

Tags: Computer science, CUDA, Matrix multiplication, OpenCL, Package

March 25, 2016 by hgpu
