Pipelined Iterative Solvers with Kernel Fusion for Graphics Processing Units
Institute for Microelectronics, TU Wien, Gusshausstrasse 27-29/E360, A-1040 Wien, Austria
arXiv:1410.4054 [cs.MS] (15 Oct 2014)
@article{rupp2014pipelined,
title={Pipelined Iterative Solvers with Kernel Fusion for Graphics Processing Units},
author={Rupp, Karl and Weinbub, Josef and J{\"u}ngel, Ansgar and Grasser, Tibor},
year={2014}
}
We revisit the implementation of iterative solvers on discrete graphics processing units and demonstrate the benefit of implementations using extensive kernel fusion for pipelined formulations over conventional implementations of classical formulations. The proposed implementations with both CUDA and OpenCL are freely available in ViennaCL and achieve up to three-fold performance gains when compared to other solver packages for graphics processing units. Highest performance gains are obtained for small to medium-sized systems, while our implementations remain competitive with vendor-tuned implementations for very large systems. Our results are especially beneficial for transient problems, where many small to medium-sized systems instead of a single big system need to be solved.
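To illustrate the kernel fusion idea named in the abstract, here is a minimal sketch in plain C++, where each loop stands in for one GPU kernel launch. It is a CPU analogy under stated assumptions, not the ViennaCL implementation: the function names `update_unfused` and `update_fused` are hypothetical, and the vector updates are modelled on a conjugate-gradient-style step chosen only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper names for illustration; not taken from ViennaCL.

// Unfused variant: three separate passes over memory, analogous to three
// kernel launches (two vector updates followed by a dot product).
double update_unfused(std::vector<double>& x, std::vector<double>& r,
                      const std::vector<double>& p,
                      const std::vector<double>& Ap, double alpha) {
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) x[i] += alpha * p[i];   // pass 1: x <- x + alpha p
    for (std::size_t i = 0; i < n; ++i) r[i] -= alpha * Ap[i];  // pass 2: r <- r - alpha Ap
    double rr = 0.0;
    for (std::size_t i = 0; i < n; ++i) rr += r[i] * r[i];      // pass 3: rr <- <r, r>
    return rr;
}

// Fused variant: one pass performs both updates and accumulates <r, r>,
// so each entry of r is used for the dot product while still in a register
// instead of being read back from memory in a later pass.
double update_fused(std::vector<double>& x, std::vector<double>& r,
                    const std::vector<double>& p,
                    const std::vector<double>& Ap, double alpha) {
    const std::size_t n = x.size();
    double rr = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        x[i] += alpha * p[i];
        const double ri = r[i] - alpha * Ap[i];
        r[i] = ri;
        rr += ri * ri;
    }
    return rr;
}
```

Both functions leave `x` and `r` in the same state and return the same squared residual norm. On a GPU, the fused form would correspond to a single kernel launch in place of three, which matters most for small to medium-sized systems where launch overhead and memory traffic are a large share of the runtime.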
October 16, 2014 by hgpu