Structured Orthogonal Inversion of Block p-Cyclic Matrices on Multicore with GPU Accelerators
Donetsk National Technical University, Donetsk, 83001, Ukraine
Lecture Notes in Computer Science, F. Silva, I. Dutra and V. Santos Costa, eds. 8632, 524, 2014
@article{gogolenko2014structured,
title={Structured Orthogonal Inversion of Block p-Cyclic Matrices on Multicore with GPU Accelerators},
author={Gogolenko, Sergiy and Bai, Zhaojun and Scalettar, Richard},
year={2014}
}
We present a block structured orthogonal factorization (BSOF) algorithm and its parallelization for computing the inverse of block p-cyclic matrices. We target high performance on multicore systems with GPU accelerators. We provide a quantitative performance model for optimal host-device load balance, and validate the model through numerical tests. Benchmarking results show that the parallel BSOF-based inversion algorithm attains up to 90% of DGEMM performance on hybrid CPU+GPU systems.
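To make the problem concrete, here is a minimal NumPy sketch of what inverting a block p-cyclic matrix through an orthogonal factorization means. It is a dense reference computation for checking results, not the BSOF algorithm itself: it ignores the block structure that BSOF exploits and does not touch the GPU. The block layout (diagonal blocks, subdiagonal blocks, and one wrap-around block in the top-right corner) is an assumed normal form, and the names `block_p_cyclic` and `inverse_via_qr` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def block_p_cyclic(A, B):
    """Assemble a dense block p-cyclic matrix from p blocks A[i] and p blocks B[i].

    Assumed layout for this sketch (all blocks n-by-n, p >= 2):
    A[i] sits at block position (i, i); B[i] sits at (i+1, i) for i < p-1,
    and B[p-1] wraps around to the top-right corner, position (0, p-1).
    """
    p = len(A)
    n = A[0].shape[0]
    H = np.zeros((p * n, p * n))
    for i in range(p):
        H[i * n:(i + 1) * n, i * n:(i + 1) * n] = A[i]
        r = (i + 1) % p  # block row below the diagonal, wrapping at the end
        H[r * n:(r + 1) * n, i * n:(i + 1) * n] = B[i]
    return H


def inverse_via_qr(H):
    """Invert H from its QR factorization: H = Q R, so H^{-1} = R^{-1} Q^T.

    A general solver is used for R X = Q^T; it does not exploit the
    triangular shape of R, which is acceptable for a small reference check.
    """
    Q, R = np.linalg.qr(H)
    return np.linalg.solve(R, Q.T)


# Small, well-conditioned test problem: identity diagonal blocks and
# small random off-diagonal blocks.
rng = np.random.default_rng(0)
p, n = 4, 3
A = [np.eye(n) for _ in range(p)]
B = [0.1 * rng.standard_normal((n, n)) for _ in range(p)]

H = block_p_cyclic(A, B)
H_inv = inverse_via_qr(H)
residual = np.linalg.norm(H @ H_inv - np.eye(p * n))
print("residual of H @ H_inv - I:", residual)
```

The dense route costs on the order of (pn)^3 operations and treats every zero block as data. A structured method works block by block instead, which is why the abstract measures the result against DGEMM throughput.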
August 23, 2014 by hgpu