A Performance Comparison of Algebraic Multigrid Preconditioners on CPUs, GPUs, and Xeon Phis

K. Rupp, J. Weinbub, F. Rudolf, A. Morhammer, T. Grasser, A. Jüngel
Institute for Microelectronics, TU Wien, Gusshausstrasse 27-29/E360, A-1040 Wien, Austria
Institute for Microelectronics, TU Wien, 2015








Algebraic multigrid preconditioners for accelerating iterative solvers are a popular choice for a broad range of applications, because they are able to obtain asymptotic optimality, yet can be applied in a black-box manner. However, only a few variants of algebraic multigrid preconditioners can fully benefit from the fine-grained parallelization available on multi- and many-core architectures. Previous approaches focused on graphics processing units from NVIDIA and did not aim at fair comparisons in terms of power or price. We extend these earlier approaches to current high-end hardware from Intel, NVIDIA, and AMD. Our results show that GPUs from NVIDIA and AMD are equally well suited for problems that are large enough to hide latencies across the PCI-Express bus, yet small enough to fit into GPU RAM. Purely CPU-based systems offer good performance across a much broader range of problem sizes and are even on par with GPUs in the regime where GPUs yield their best performance. While we also demonstrate good performance on Xeon Phis, they were identified as the slowest platform overall in our benchmarks.
