
Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations (Part 2: Double Precision GPUs)

Dominik Göddeke and Robert Strzodka
Applied Mathematics, Dortmund University of Technology, Germany
Fakultät für Mathematik, Technische Universität Dortmund, Ergebnisberichte des Instituts für Angewandte Mathematik, Nummer 370, Aug. 2008, Technical Report

@techreport{Goeddeke:2008:PAA,
   author={Dominik G{\"o}ddeke and Robert Strzodka},
   title={Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in {FEM} simulations (Part 2: Double Precision {GPUs})},
   institution={Fakult{\"a}t f{\"u}r {M}athematik, {T}echnische {U}niversit{\"a}t {D}ortmund},
   year={2008},
   note={Ergebnisberichte des {I}nstituts f{\"u}r {A}ngewandte {M}athematik, {N}ummer 370},
   month={aug}
}


In a previous publication, we examined the fundamental difference between computational precision and result accuracy in the context of the iterative solution of linear systems as they typically arise in the Finite Element discretization of Partial Differential Equations (PDEs) [1]. In particular, we evaluated mixed- and emulated-precision schemes on commodity graphics processors (GPUs), which at that time only supported computations in single precision. With the advent of graphics cards that natively provide double precision, this report updates our previous results. We demonstrate that with new co-processor hardware supporting native double precision, such as NVIDIA's G200 architecture, the situation does not change qualitatively for PDEs, and the previously introduced mixed-precision schemes are still preferable to double precision alone. However, the schemes achieve significant quantitative performance improvements with the more powerful hardware. In particular, we demonstrate that a Multigrid scheme can accurately solve a common test problem in Finite Element settings with one million unknowns in less than 0.1 seconds, which is truly outstanding performance. We support these conclusions by exploring the algorithmic design space enlarged by the availability of double precision directly in the hardware.
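The idea behind a mixed-precision scheme can be illustrated with a small sketch of iterative refinement: the residual and the solution update are kept in double precision, while the correction is computed in single precision. The sketch below is not the authors' implementation. It assumes a 1D Poisson matrix tridiag(-1, 2, -1) as a stand-in test problem, uses a tridiagonal direct solve (Thomas algorithm) in place of the Multigrid solver mentioned above, and emulates single precision on the CPU by rounding through `struct`. All function names (`f32`, `residual`, `solve_single`, `mixed_precision_solve`) are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def residual(b, x):
    """Double-precision residual r = b - A x for A = tridiag(-1, 2, -1)."""
    n = len(x)
    r = []
    for i in range(n):
        ax = 2.0 * x[i]
        if i > 0:
            ax -= x[i - 1]
        if i < n - 1:
            ax -= x[i + 1]
        r.append(b[i] - ax)
    return r


def solve_single(r):
    """Solve tridiag(-1, 2, -1) c = r with the Thomas algorithm.

    Every intermediate result is rounded to single precision, so this
    plays the role of the low-precision inner solver (an assumption made
    for illustration; the report uses iterative solvers on the GPU).
    """
    n = len(r)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = f32(-1.0 / 2.0)
    dp[0] = f32(f32(r[0]) / 2.0)
    for i in range(1, n):
        denom = f32(2.0 + cp[i - 1])
        cp[i] = f32(-1.0 / denom)
        dp[i] = f32(f32(f32(r[i]) + dp[i - 1]) / denom)
    c = [0.0] * n
    c[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        c[i] = f32(dp[i] - f32(cp[i] * c[i + 1]))
    return c


def mixed_precision_solve(b, tol=1e-10, max_outer=20):
    """Iterative refinement: double-precision outer loop, single-precision inner solve.

    Returns the solution and the number of inner solves that were needed.
    """
    x = [0.0] * len(b)
    norm_b = max(abs(v) for v in b)
    for it in range(max_outer):
        r = residual(b, x)                      # high precision
        if max(abs(v) for v in r) <= tol * norm_b:
            return x, it
        c = solve_single(r)                     # low precision
        x = [xi + ci for xi, ci in zip(x, c)]   # high-precision update
    return x, max_outer


if __name__ == "__main__":
    n = 64
    x, inner_solves = mixed_precision_solve([1.0] * n)
    print("inner single-precision solves:", inner_solves)
```

The design point is that the expensive part of the work (the inner solve) runs in the cheaper format, while the few double-precision operations in the outer loop recover the accuracy of a full double-precision solve.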
