A Mixed-Precision Algorithm for the Solution of Lyapunov Equations on Hybrid CPU-GPU Platforms

Peter Benner, Pablo Ezzatti, Daniel Kressner, Enrique S. Quintana-Orti, Alfredo Remon

Max-Planck-Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, D-39106 Magdeburg (Germany)

Parallel Computing (28 December 2010)

DOI:10.1016/j.parco.2010.12.002

@article{Benner2010,
  title={A Mixed-Precision Algorithm for the Solution of Lyapunov Equations on Hybrid CPU-GPU Platforms},
  journal={Parallel Computing},
  volume={In Press},
  number={},
  pages={-},
  year={2010},
  note={},
  issn={0167-8191},
  doi={10.1016/j.parco.2010.12.002},
  url={http://www.sciencedirect.com/science/article/B6V12-51TGFYJ-1/2/5d4fd77109f9323d4b056cca549fe813},
  author={Peter Benner and Pablo Ezzatti and Daniel Kressner and Enrique S. Quintana-Orti and Alfredo Remon},
  keywords={model reduction}
}

We describe a hybrid Lyapunov solver based on the matrix sign function, in which the compute-intensive parts of the computation are accelerated on a graphics processor (GPU) while the remaining operations execute on a general-purpose multi-core processor (CPU). The initial stage of the iteration operates in single-precision arithmetic and returns a low-rank factor of an approximate solution. As the main computation in this stage consists of explicit matrix inversions, we suggest a hybrid implementation of Gauß-Jordan elimination that uses look-ahead to overlap computations on the GPU and CPU. To improve the approximate solution, we introduce an iterative refinement procedure that cheaply recovers full double-precision accuracy. In contrast to earlier approaches to iterative refinement for Lyapunov equations, this approach retains the low-rank factorization structure of the approximate solution. The combination of the two stages results in a mixed-precision algorithm that exploits the capabilities of both general-purpose CPUs and many-core GPUs and overlaps critical computations. Numerical experiments using real-world data and a platform equipped with two Intel Xeon QuadCore processors and an NVIDIA Tesla C1060 show a significant efficiency gain of the hybrid method compared to a classical CPU implementation.
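
To make the role of the explicit matrix inversions concrete, the following is a minimal sketch of the basic sign-function (Newton) iteration for a Lyapunov equation, not the paper's implementation. It assumes the equation has the form A X + X A^T + Q = 0 with a stable matrix A, works on small dense matrices in plain double-precision Python, and omits everything that makes the paper's method distinctive: the low-rank factored iterate, the single-precision first stage, iterative refinement, look-ahead, and any GPU offloading. All function names (`gauss_jordan_inverse`, `solve_lyapunov_sign`, `lyapunov_residual`) are hypothetical and chosen only for illustration.

```python
def gauss_jordan_inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [A | I].
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate the column from every other row (above and below).
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def solve_lyapunov_sign(a, q, tol=1e-12, max_iter=100):
    """Solve A X + X A^T + Q = 0 for stable A (assumed) via the sign iteration.

    A_{k+1} = (A_k + A_k^{-1}) / 2
    Q_{k+1} = (Q_k + A_k^{-1} Q_k A_k^{-T}) / 2
    As A_k approaches -I, Q_k approaches 2 X.
    """
    n = len(a)
    ak = [[float(v) for v in row] for row in a]
    qk = [[float(v) for v in row] for row in q]
    for _ in range(max_iter):
        ainv = gauss_jordan_inverse(ak)  # dominant cost of each step
        t = matmul(matmul(ainv, qk), transpose(ainv))
        qk = [[0.5 * (qk[i][j] + t[i][j]) for j in range(n)] for i in range(n)]
        ak = [[0.5 * (ak[i][j] + ainv[i][j]) for j in range(n)] for i in range(n)]
        # Converged once A_k is (numerically) equal to -I.
        err = max(abs(ak[i][j] + (1.0 if i == j else 0.0))
                  for i in range(n) for j in range(n))
        if err < tol:
            return [[0.5 * v for v in row] for row in qk]
    raise RuntimeError("sign iteration did not converge")


def lyapunov_residual(a, x, q):
    """Largest absolute entry of A X + X A^T + Q."""
    ax = matmul(a, x)
    xat = matmul(x, transpose(a))
    n = len(a)
    return max(abs(ax[i][j] + xat[i][j] + q[i][j])
               for i in range(n) for j in range(n))


# Small example with a stable upper-triangular A (eigenvalues -2 and -3).
A = [[-2.0, 1.0], [0.0, -3.0]]
Q = [[2.0, 0.0], [0.0, 6.0]]
X = solve_lyapunov_sign(A, Q)
print(lyapunov_residual(A, X, Q))
```

Each step of this loop is dominated by one matrix inversion and two matrix products, which is consistent with the abstract's choice of inversion as the kernel to accelerate.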

Tags: CUBLAS, CUDA, Mathematics, Mixed precision, nVidia, Tesla C1060

January 16, 2011 by hgpu
