
Selecting the Best Tridiagonal System Solver Projected on Multi-Core CPU and GPU Platforms

Pablo Quesada-Barriuso, Julian Lamas-Rodriguez, Dora B. Heras, Montserrat Boo, Francisco Arguello
Centro de Investigacion en Tecnoloxias da Informacion (CITIUS), Univ. of Santiago de Compostela, Spain
The 2011 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’11), 2011

@inproceedings{quesada2011selecting,
   title={Selecting the Best Tridiagonal System Solver Projected on Multi-Core CPU and GPU Platforms},
   author={Quesada-Barriuso, Pablo and Lamas-Rodriguez, Julian and Heras, Dora B. and Boo, Montserrat and Arguello, Francisco},
   booktitle={The 2011 International Conference on Parallel and Distributed Processing Techniques and Applications},
   year={2011}
}



Multi-core processors and graphics cards are now commodity hardware found in personal computers, and both CPUs and GPUs are capable of high-end computation. In this paper we present and compare parallel implementations of two tridiagonal system solvers: the cyclic reduction method, as an example of fine-grained parallelism, and Bondeli’s algorithm, as an example of coarse-grained parallelism. Both algorithms are implemented for GPU architectures using CUDA and for shared-memory multi-core CPU architectures using OpenMP. The results are compared in terms of execution time, speedup, and GFLOPS. For a large system of 2^22 equations, Bondeli’s algorithm gave the best results on multi-core CPU platforms (speedup 1.55x and 0.84 GFLOPS), while cyclic reduction was the best on GPU platforms (speedup 17.06x and 5.09 GFLOPS).
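
For readers unfamiliar with cyclic reduction, the following is a minimal serial sketch in Python, not the authors' CUDA or OpenMP implementation. It assumes a diagonally dominant system (no pivoting is performed), and the function name `cyclic_reduction` and the array names `a`, `b`, `c`, `d` (sub-diagonal, diagonal, super-diagonal, right-hand side) are illustrative choices, not taken from the paper. Each reduction level eliminates the odd-indexed unknowns; the iterations of both loops are independent of one another, which is the source of the fine-grained parallelism the abstract refers to.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for x.

    Sketch only: assumes a diagonally dominant system (no pivoting).
    a[0] and c[-1] are ignored and treated as zero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    a = [0.0] + list(a[1:])
    c = list(c[:-1]) + [0.0]

    # Forward reduction: build a half-size system in the even-indexed unknowns.
    # Every iteration is independent, so this loop could run in parallel.
    ra, rb, rc, rd = [], [], [], []
    for i in range(0, n, 2):
        na, nb, nc, nd = 0.0, b[i], 0.0, d[i]
        if i > 0:  # eliminate x[i-1] using equation i-1
            alpha = a[i] / b[i - 1]
            na = -alpha * a[i - 1]
            nb -= alpha * c[i - 1]
            nd -= alpha * d[i - 1]
        if i + 1 < n:  # eliminate x[i+1] using equation i+1
            gamma = c[i] / b[i + 1]
            nb -= gamma * a[i + 1]
            nc = -gamma * c[i + 1]
            nd -= gamma * d[i + 1]
        ra.append(na)
        rb.append(nb)
        rc.append(nc)
        rd.append(nd)

    # Solve the reduced system recursively.
    x = [0.0] * n
    x[0::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back substitution: recover the odd-indexed unknowns (also independent).
    for i in range(1, n, 2):
        right = x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - a[i] * x[i - 1] - c[i] * right) / b[i]
    return x


# Usage: a small diagonally dominant system with constant coefficients.
n = 8
x = cyclic_reduction([-1.0] * n, [4.0] * n, [-1.0] * n, [1.0] * n)
```

The recursion depth is about log2(n), so for the 2^22 equations mentioned above there would be roughly 22 reduction levels, each halving the number of active unknowns.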

