GAMER: a GPU-Accelerated Adaptive Mesh Refinement Code for Astrophysics

Hsi-Yu Schive, Yu-Chih Tsai, Tzihong Chiueh
Department of Physics, National Taiwan University, 106, Taipei, Taiwan, R.O.C.
arXiv:0907.3390v2 [astro-ph.IM] (20 Jul 2009)

@article{schive2010gamer,
   title={Gamer: A graphic processing unit accelerated adaptive-mesh-refinement code for astrophysics},
   author={Schive, H.Y. and Tsai, Y.C. and Chiueh, T.},
   journal={The Astrophysical Journal Supplement Series},
   volume={186},
   pages={457},
   year={2010},
   publisher={IOP Publishing}
}

We present GAMER (GPU-accelerated Adaptive MEsh Refinement code), a newly developed code that uses the graphics processing unit (GPU) to speed up adaptive mesh refinement (AMR) astrophysical simulations by a large factor. The AMR implementation is based on a hierarchy of grid patches organized in an oct-tree data structure. We adopt a three-dimensional relaxing TVD scheme for the hydrodynamic solver and a multi-level relaxation scheme for the Poisson solver. Both solvers are implemented on the GPU, so that hundreds of patches can be advanced in parallel. The overhead of data transfer between CPU and GPU is reduced by using the GPU's asynchronous memory copies, and the time spent computing ghost-zone values for each patch is reduced by overlapping it with the GPU computations. We demonstrate the accuracy of the code with several standard astrophysical test problems. GAMER is a parallel code that can run on a multi-GPU cluster. We measure its performance with purely baryonic cosmological simulations on different hardware configurations, where detailed timing analyses compare the computations with and without GPU acceleration. Maximum speed-up factors of 12.19 and 10.47 are demonstrated using 1 GPU with 4096^3 effective resolution and 16 GPUs with 8192^3 effective resolution, respectively.
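
The oct-tree patch hierarchy mentioned in the abstract can be illustrated with a minimal Python sketch. This is not GAMER's implementation: the `Patch` class, the `PATCH_SIZE` of 8 cells per side, the refinement ratio of 2, and the helper names are all assumptions made for illustration, and the base grid size in the usage note is a hypothetical number.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

PATCH_SIZE = 8  # cells per patch side; assumed for illustration, not stated in the abstract


@dataclass
class Patch:
    level: int                      # refinement level (0 = root level)
    corner: Tuple[int, int, int]    # lower corner, in units of this level's cells
    children: List["Patch"] = field(default_factory=list)

    def refine(self) -> List["Patch"]:
        """Split this patch into 8 children on the next level (assumed ratio 2)."""
        if self.children:           # already refined: return the existing children
            return self.children
        # The same physical corner has doubled coordinates on the finer level.
        x, y, z = (2 * c for c in self.corner)
        self.children = [
            Patch(self.level + 1,
                  (x + dx * PATCH_SIZE, y + dy * PATCH_SIZE, z + dz * PATCH_SIZE))
            for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)
        ]
        return self.children


def leaf_patches(patch: Patch) -> List[Patch]:
    """Collect all unrefined patches below (and including) the given patch."""
    if not patch.children:
        return [patch]
    leaves: List[Patch] = []
    for child in patch.children:
        leaves.extend(leaf_patches(child))
    return leaves


def effective_resolution(base_cells: int, max_level: int) -> int:
    """Cells per side if the whole domain were covered at the finest level."""
    return base_cells * 2 ** max_level
```

For example, under these assumptions a hypothetical root grid of 256 cells per side with 4 refinement levels gives `effective_resolution(256, 4) == 4096`. Refining a root patch and then one of its children leaves 15 leaf patches, each of which could be handed to a solver independently.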