Maximize Performance on GPUs Using the Rake-based Optimization: A Case Study

Jianbin Fang, Ana Lucia Varbanescu, Henk Sips
Parallel and Distributed Systems Group, Delft University of Technology, Delft, the Netherlands
ICT.Open, 2011

@inproceedings{fang2012maximize,
   author={Jianbin Fang and Ana Lucia Varbanescu and Henk Sips},
   title={Maximize Performance on GPUs Using the Rake-based Optimization: A Case Study},
   booktitle={Proceedings of ICT.Open 2011},
   year={2011},
   month={November},
   note={(an extension to the FGC’11 paper)},
   location={Veldhoven, the Netherlands},
   url={http://www.pds.ewi.tudelft.nl/fileadmin/pds/homepages/fang/papers/asci2k11_fang.pdf},
   topic={Parallel Programming},
   group={PDS}
}

In this paper, we analyze the trade-offs encountered when minimizing the total execution time of rake-based applications on GPUs. We use clustering data streams as a case study and present a rake-based implementation of it that uses memory more efficiently. To maximize performance across different problem sizes and architectures, we propose a model-based auto-tuning solution. Experimental results show that our fully optimized implementation runs 2.1x and 1.4x faster than the native OpenCL implementation on the NVIDIA GTX480 and AMD HD5870, respectively; it also achieves a 1.4x to 3.3x speedup over the original CUDA implementation on the GTX480.