
Maximize Performance on GPUs Using the Rake-based Optimization: A Case Study

Jianbin Fang, Ana Lucia Varbanescu, Henk Sips
Parallel and Distributed Systems Group, Delft University of Technology, Delft, the Netherlands
ICT.Open, 2011



In this paper, we analyze the trade-offs encountered when minimizing the total execution time of rake-based applications on GPUs. We use clustering of data streams as a case study and present a rake-based implementation of it that uses memory more efficiently. To maximize performance across different problem sizes and architectures, we propose a model-based auto-tuning solution. Experimental results show that our fully optimized implementation runs 2.1x and 1.4x faster than the native OpenCL implementation on the NVIDIA GTX480 and AMD HD5870, respectively, and achieves a 1.4x to 3.3x speedup over the original CUDA implementation on the GTX480.

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org