Optimizing Sweep3D for Graphic Processor Unit
Department of Computer Sciences, National University of Defense Technology, 410073 Changsha, P.R. China
Algorithms and Architectures for Parallel Processing, Lecture Notes in Computer Science, 2010, Volume 6081/2010, 416-426
@article{gong2010optimizing,
title={Optimizing Sweep3D for Graphic Processor Unit},
author={Gong, C. and Liu, J. and Gong, Z. and Qin, J. and Xie, J.},
journal={Algorithms and Architectures for Parallel Processing},
pages={416--426},
year={2010},
publisher={Springer}
}
As a powerful and flexible processor, the Graphics Processing Unit (GPU) offers great capability for many high-performance computing applications. Sweep3D, which deterministically simulates single-group, time-independent discrete ordinates (Sn) neutron transport on a 3D Cartesian geometry, represents the key part of a real ASCI application. The wavefront process used for parallel computation in Sweep3D limits the number of concurrent threads on the GPU. In this paper, we present multi-dimensional optimization methods for Sweep3D that can be efficiently implemented on the fine-grained parallel architecture of the GPU. Our results show that the overall performance of Sweep3D on a CPU-GPU hybrid platform can be improved by up to 2.25 times compared to the CPU-based implementation.
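To illustrate why a wavefront process limits concurrency, the following minimal sketch counts how many cells can be updated at once during a single-direction sweep. It assumes the usual sweep dependency, in which cell (i, j, k) needs its three upstream neighbours, so only cells on the same diagonal plane i + j + k = d are independent. The function name `wavefront_sizes` and the 4x4x4 grid are hypothetical choices for illustration; this is not the authors' implementation and it ignores angles, octants, and domain decomposition.

```python
from collections import Counter


def wavefront_sizes(ni, nj, nk):
    """Count the cells on each diagonal wavefront i + j + k = d of an ni x nj x nk grid.

    Assumption: cell (i, j, k) depends on (i-1, j, k), (i, j-1, k) and (i, j, k-1),
    so all cells sharing the same d can be processed concurrently.
    """
    counts = Counter(
        i + j + k
        for i in range(ni)
        for j in range(nj)
        for k in range(nk)
    )
    # Diagonal indices run from 0 to (ni - 1) + (nj - 1) + (nk - 1).
    return [counts[d] for d in range(ni + nj + nk - 2)]


sizes = wavefront_sizes(4, 4, 4)
print(sizes)       # cells available per wavefront step
print(max(sizes))  # peak number of independent cells
```

In this toy grid the available parallelism ramps up from a single cell, peaks at far fewer cells than the grid contains, and then ramps down again, which is the kind of limit on concurrent threads that the abstract refers to.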
May 25, 2011 by hgpu