An Auto-tuned Method for Solving Large Tridiagonal Systems on the GPU

Andrew Davidson, Yao Zhang, John D. Owens
University of California, Davis
IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2011


We present a multi-stage method for solving large tridiagonal systems on the GPU. Previously, large tridiagonal systems could not be solved efficiently because of the limited size of on-chip shared memory. We tackle this problem by splitting the systems into smaller ones and then solving them on-chip. The multi-stage nature of our method, together with varied workloads and GPUs of different capabilities, calls for an auto-tuning strategy that carefully selects the switch points between computation stages. In particular, we show two ways to prune the tuning space effectively and thus avoid an impractical exhaustive search: (1) apply algorithmic knowledge to decouple tuning parameters, and (2) estimate search starting points from GPU architecture parameters. We demonstrate that auto-tuning is a powerful tool: it improves performance by up to 5x, saves 17% and 32% of execution time on average over static and dynamic tuning, respectively, and enables our multi-stage solver to outperform the Intel MKL tridiagonal solver on many parallel tridiagonal systems by 6-11x.
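
The split-then-solve idea can be illustrated with a small CPU sketch. The abstract does not name the splitting scheme, so this example assumes parallel cyclic reduction (PCR), a standard technique in which one reduction step turns a tridiagonal system into two independent systems of half the size (the even-indexed and odd-indexed unknowns). The function names and the `max_on_chip` parameter are hypothetical: `max_on_chip` stands in for the largest system that fits in shared memory, which is the kind of switch point an auto-tuner would select. The sketch runs serially in Python and only shows the arithmetic; it is not the authors' GPU implementation.

```python
def thomas(a, b, c, d):
    """Solve one tridiagonal system serially (Thomas algorithm, no pivoting).

    Row i is a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[-1] == 0.
    """
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def pcr_split(a, b, c, d):
    """One PCR step: eliminate x[i-1] and x[i+1] from every row i.

    Afterwards row i couples only x[i-2], x[i], x[i+2], so the even and
    odd rows form two independent tridiagonal systems.
    """
    n = len(b)
    na, nb, nc, nd = [0.0] * n, list(b), [0.0] * n, list(d)
    for i in range(n):
        if i > 0:
            k1 = a[i] / b[i - 1]
            na[i] = -a[i - 1] * k1
            nb[i] -= c[i - 1] * k1
            nd[i] -= d[i - 1] * k1
        if i < n - 1:
            k2 = c[i] / b[i + 1]
            nc[i] = -c[i + 1] * k2
            nb[i] -= a[i + 1] * k2
            nd[i] -= d[i + 1] * k2
    even = tuple(v[0::2] for v in (na, nb, nc, nd))
    odd = tuple(v[1::2] for v in (na, nb, nc, nd))
    return even, odd


def solve_multistage(a, b, c, d, max_on_chip):
    """Split until each system has at most max_on_chip rows, then solve it."""
    if len(b) <= max(1, max_on_chip):
        return thomas(a, b, c, d)  # stand-in for the on-chip solver
    even, odd = pcr_split(a, b, c, d)
    x = [0.0] * len(b)
    x[0::2] = solve_multistage(*even, max_on_chip)
    x[1::2] = solve_multistage(*odd, max_on_chip)
    return x
```

Calling `solve_multistage(a, b, c, d, max_on_chip=4)` on a diagonally dominant system should agree with `thomas(a, b, c, d)` up to floating-point rounding. Changing `max_on_chip` moves the point at which splitting stops and direct solving begins, which is the trade-off the auto-tuner explores on real hardware.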
