Reproducible and Accurate Matrix Multiplication for GPU Accelerators

Roman Iakymchuk, David Defour, Sylvain Collange, Stef Graillat
Sorbonne Universites, UPMC Univ Paris 06, UMR 7606, LIP6, F-75005 Paris, France
hal-01102877 (14 January 2015)

@article{iakymchuk2015reproducible,
   title={Reproducible and Accurate Matrix Multiplication for GPU Accelerators},
   author={Iakymchuk, Roman and Defour, David and Collange, Sylvain and Graillat, Stef},
   year={2015}
}

Due to the non-associativity of floating-point operations and dynamic scheduling on parallel architectures, getting a bitwise reproducible floating-point result for multiple executions of the same code on different or even similar parallel architectures is challenging. In this paper, we address the problem of reproducibility in the context of matrix multiplication and propose an algorithm that yields both reproducible and accurate results. This algorithm is composed of two main stages: a filtering stage that uses fast vectorized floating-point expansions in conjunction with error-free transformations, and an accumulation stage based on Kulisch long accumulators in a high-radix carry-save representation. Finally, we provide implementations and performance results in parallel environments like GPUs.
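The two ingredients named in the abstract can be illustrated with a small Python sketch. It is not the authors' GPU implementation: `two_sum` is Knuth's classic error-free transformation, and `exact_dot` is a hypothetical helper that uses Python's arbitrary-precision integers as a stand-in for a Kulisch long accumulator (no carry-save representation, no expansion-based filtering stage). The names `SCALE`, `to_fixed`, `naive_dot`, and `exact_dot` are assumptions made for illustration only.

```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b == s + e exactly
    (barring overflow)."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

# Assumption: every finite double is an integer multiple of 2**-1074,
# so a product of two doubles is an integer multiple of 2**-2148.
SCALE = 1074

def to_fixed(x):
    """Exactly convert a finite double to an integer count of 2**-SCALE units."""
    num, den = x.as_integer_ratio()      # den is a power of two, at most 2**1074
    return num * ((1 << SCALE) // den)

def naive_dot(xs, ys):
    """Ordinary floating-point dot product; the result depends on summation order."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

def exact_dot(xs, ys):
    """Dot product accumulated exactly in a big integer, rounded once at the end.
    Integer addition is associative, so any summation order gives the same bits."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += to_fixed(x) * to_fixed(y)   # exact product in 2**-(2*SCALE) units
    return float(Fraction(acc, 1 << (2 * SCALE)))

if __name__ == "__main__":
    ones = [1.0, 1.0, 1.0]
    print(naive_dot([1e16, 1.0, -1e16], ones), naive_dot([1e16, -1e16, 1.0], ones))
    print(exact_dot([1e16, 1.0, -1e16], ones), exact_dot([1e16, -1e16, 1.0], ones))
    print(two_sum(1e16, 1.0))
```

Each entry of a matrix product is a dot product of a row and a column, so applying an order-independent accumulation such as `exact_dot` per entry would make the whole product independent of how the partial sums are scheduled. The sketch trades all performance for clarity; the paper's filtering stage exists precisely to avoid touching the long accumulator for most operations.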
