NQueens on CUDA: Optimization Issues

Frank Feinbube, Bernhard Rabe, Martin von Löwis, Andreas Polze
Ninth International Symposium on Parallel and Distributed Computing (ISPDC), 2010


   title={NQueens on CUDA: Optimization Issues},

   author={Feinbube, F. and Rabe, B. and von L{\"o}wis, M. and Polze, A.},

   booktitle={2010 Ninth International Symposium on Parallel and Distributed Computing},








Today's commercial off-the-shelf computer systems are multicore systems that combine CPUs, graphics processors (GPUs), and custom devices. Compared with CPU cores, graphics cards can operate hundreds to thousands of compute units in parallel. To benefit from these GPU computing resources, applications have to be parallelized and adapted to the target architecture. In this paper we report our experience implementing a solver for the NQueens puzzle on GPUs using Nvidia's CUDA (Compute Unified Device Architecture) technology. Using memory usage and memory access as examples, we demonstrate that optimizations of CUDA programs may have contrary effects on different CUDA architectures. Our evaluation results show that new programming languages or compilers alone are not sufficient to achieve the best results with emerging graphics card computing.
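To make the parallelization requirement concrete, the following is a minimal CPU-side sketch in C++ of how an NQueens count can be split into independent subproblems, one per queen position in the first row. It is not the authors' CUDA kernel: the bitmask backtracking and the function names `solve`, `count_per_first_column`, and `count_solutions` are assumptions chosen for illustration. On a GPU, each such subproblem (or a finer-grained prefix of placements) could plausibly be assigned to its own thread.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Count the completions of a partial placement with bitmask backtracking.
// cols, diag1 and diag2 mark the squares of the current row that are
// attacked by queens already placed in earlier rows.
static std::uint64_t solve(std::uint32_t full, std::uint32_t cols,
                           std::uint32_t diag1, std::uint32_t diag2) {
    if (cols == full) return 1;  // every column holds a queen
    std::uint64_t count = 0;
    std::uint32_t free_squares = full & ~(cols | diag1 | diag2);
    while (free_squares) {
        std::uint32_t bit = free_squares & (~free_squares + 1);  // lowest free square
        free_squares ^= bit;
        count += solve(full, cols | bit,
                       ((diag1 | bit) << 1) & full,
                       (diag2 | bit) >> 1);
    }
    return count;
}

// Hypothetical decomposition: one independent task per first-row column.
// The loop body has no shared state, so each iteration stands in for
// work that a separate GPU thread could perform.
std::vector<std::uint64_t> count_per_first_column(int n) {
    assert(n >= 1 && n <= 31);  // board must fit into a 32-bit mask
    const std::uint32_t full = (1u << n) - 1;
    std::vector<std::uint64_t> partial(n);
    for (int c = 0; c < n; ++c) {
        const std::uint32_t bit = 1u << c;
        partial[c] = solve(full, bit, (bit << 1) & full, bit >> 1);
    }
    return partial;
}

// Reduce the per-task counts to the total number of solutions.
std::uint64_t count_solutions(int n) {
    std::uint64_t total = 0;
    for (std::uint64_t p : count_per_first_column(n)) total += p;
    return total;
}
```

For example, `count_solutions(8)` returns 92, the known number of solutions on the 8×8 board. The sketch covers only the decomposition into independent tasks; where the per-task state lives in GPU memory and how it is accessed, which is the optimization question the abstract raises, is left open here.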
