Efficient Parallel Nonnegative Least Squares on Multicore Architectures

Yuancheng Luo, Ramani Duraiswami
Department of Computer Science, University of Maryland, Room 3368 A.V. Williams, College Park, MD 20740
SIAM J. on Scientific Computing, Volume 33, Issue 5, pp. 2848-2863, 2011











We parallelize a version of the active-set iterative algorithm derived from the original work of Lawson and Hanson [Solving Least Squares Problems, Prentice-Hall, 1974] on multicore architectures. At every iteration, this algorithm requires the solution of an unconstrained least squares problem for a matrix composed of the passive columns of the original system matrix. To improve performance, we use parallelizable procedures to efficiently update and downdate the $QR$ factorization of that matrix at each iteration, accounting for inserted and removed columns. We use a column-reordering strategy in the decomposition to reduce computation and memory access costs. We consider graphics processing units (GPUs) as a new mode for efficient parallel computation and compare our GPU implementations with those on multicore CPUs. Both synthetic and nonsynthetic data are used in the experiments.
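
For readers unfamiliar with the active-set iteration the abstract refers to, the following is a minimal serial sketch of a Lawson-Hanson style nonnegative least squares solver in plain Python. It is not the authors' implementation: the names `nnls` and `solve_passive`, the tolerance `tol`, and the iteration cap `max_iter` are assumptions made for illustration, and the unconstrained subproblem on the passive columns is re-solved from scratch through the normal equations rather than by updating and downdating a $QR$ factorization as the paper describes.

```python
def solve_passive(A, b, cols):
    """Unconstrained least squares restricted to the columns in `cols`.

    Assumption for this sketch: solved from scratch via the normal
    equations and Gaussian elimination, not via QR updating.
    """
    m, k = len(A), len(cols)
    G = [[sum(A[r][i] * A[r][j] for r in range(m)) for j in cols] for i in cols]
    rhs = [sum(A[r][i] * b[r] for r in range(m)) for i in cols]
    for i in range(k):
        # Partial pivoting for numerical stability.
        p = max(range(i, k), key=lambda r: abs(G[r][i]))
        G[i], G[p] = G[p], G[i]
        rhs[i], rhs[p] = rhs[p], rhs[i]
        for r in range(i + 1, k):
            f = G[r][i] / G[i][i]
            for c in range(i, k):
                G[r][c] -= f * G[i][c]
            rhs[r] -= f * rhs[i]
    z = [0.0] * k
    for i in reversed(range(k)):
        s = rhs[i] - sum(G[i][c] * z[c] for c in range(i + 1, k))
        z[i] = s / G[i][i]
    return z


def nnls(A, b, tol=1e-10, max_iter=100):
    """Minimize ||A x - b||_2 subject to x >= 0 (active-set sketch)."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    passive = []  # indices currently allowed to be positive
    for _ in range(max_iter):
        resid = [b[r] - sum(A[r][j] * x[j] for j in range(n)) for r in range(m)]
        w = [sum(A[r][j] * resid[r] for r in range(m)) for j in range(n)]
        active = [j for j in range(n) if j not in passive]
        if not active:
            break
        t = max(active, key=lambda j: w[j])
        if w[t] <= tol:
            break  # no active variable can reduce the residual further
        passive.append(t)  # column inserted into the passive set
        while True:
            z = solve_passive(A, b, passive)
            if all(v > 0 for v in z):
                break
            # Step toward z only as far as feasibility allows.
            alpha = min(
                (x[j] / (x[j] - zj) for j, zj in zip(passive, z)
                 if zj <= 0 and x[j] - zj > 0),
                default=0.0,
            )
            for j, zj in zip(passive, z):
                x[j] += alpha * (zj - x[j])
            # Columns whose variables reached zero leave the passive set.
            passive = [j for j in passive if x[j] > tol]
        x = [0.0] * n
        for j, zj in zip(passive, z):
            x[j] = zj
    return x
```

As a usage example, `nnls([[1.0, 0.0], [0.0, 1.0]], [1.0, -1.0])` keeps the second variable at zero because the unconstrained solution would make it negative. Each pass through the outer loop inserts one column into the passive set and each pass through the inner loop removes at least one, which is where incremental factorization updates would replace the from-scratch solve used here.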
