Parallel experiments with RARE-BLAS

Chemseddine Chohra, Philippe Langlois, David Parello
Univ. Perpignan Via Domitia, Digits, Architectures et Logiciels Informatiques, F-66860, Perpignan
18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, 2016

@inproceedings{chohra:lirmm-01349698,
   title={Parallel experiments with RARE-BLAS},
   author={Chohra, Chemseddine and Langlois, Philippe and Parello, David},
   url={http://hal-lirmm.ccsd.cnrs.fr/lirmm-01349698},
   booktitle={SYNASC: Symbolic and Numeric Algorithms for Scientific Computing},
   address={Timisoara, Romania},
   year={2016},
   month={Sep},
   keywords={Numerical reproducibility; floating-point arithmetic; RARE-BLAS; BLAS},
   pdf={http://hal-lirmm.ccsd.cnrs.fr/lirmm-01349698/file/SYNASC.pdf},
   hal_id={lirmm-01349698},
   hal_version={v1}
}

Numerical reproducibility failures arise in parallel computation because floating-point summation is not associative. Optimizations on massively parallel systems dynamically modify the order of floating-point operations, so numerical results may change from one run to another. We propose to ensure reproducibility by extending, as far as possible, the IEEE-754 correct rounding property to larger operation sequences. Our RARE-BLAS (Reproducible, Accurately Rounded and Efficient BLAS) benefits from recent accurate and efficient summation algorithms. Solutions for level 1 (asum, dot and nrm2) and level 2 (gemv) routines are provided. We compare their performance to the Intel MKL library and to other existing reproducible algorithms. For both shared and distributed memory parallel systems, we exhibit an extra cost of 2x in the worst-case scenario, which is satisfactory for a wide range of applications. For the Intel Xeon Phi accelerator, a larger extra cost (4x to 6x) is observed, which is still helpful at least for debugging and validation.
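As a small illustration of the effect described in the abstract, the following Python sketch sums the same three values in every possible order. It is not the RARE-BLAS implementation: the helper names `naive_sum` and `compare_orderings` are hypothetical, and the standard library's `math.fsum` stands in for a correctly rounded summation. The point is only that a left-to-right sum depends on the operation order, while a correctly rounded sum returns the same value for every order.

```python
import itertools
import math


def naive_sum(values):
    """Left-to-right floating-point sum; the result depends on the order."""
    total = 0.0
    for v in values:
        total += v
    return total


def compare_orderings(values):
    """Sum every permutation of `values` with both methods.

    Returns two sets: the distinct results of the naive sum and the
    distinct results of math.fsum (a correctly rounded sum, used here
    as a stand-in and not as the RARE-BLAS algorithm).
    """
    naive_results = set()
    rounded_results = set()
    for perm in itertools.permutations(values):
        naive_results.add(naive_sum(perm))
        rounded_results.add(math.fsum(perm))
    return naive_results, rounded_results


# Illustrative data: the exact sum is 1.0, but 1e16 + 1.0 rounds back to 1e16.
naive, rounded = compare_orderings([1e16, 1.0, -1e16])
print("naive sum results:", sorted(naive))
print("correctly rounded results:", sorted(rounded))
```

A parallel reduction effectively picks one of these orderings at run time, which is why the naive result can differ between runs, whereas a correctly rounded result cannot.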