Synergia CUDA: GPU-accelerated accelerator modeling package
Scientific Computing Division, Fermi National Accelerator Laboratory, P.O. Box 500, Batavia, Illinois 60510, USA
Journal of Physics: Conference Series, 513, 052021, 2014
@inproceedings{lu2014synergia,
  title={Synergia CUDA: GPU-accelerated accelerator modeling package},
  author={Lu, Q. and Amundson, J.},
  booktitle={Journal of Physics: Conference Series},
  volume={513},
  number={5},
  pages={052021},
  year={2014},
  organization={IOP Publishing}
}
Synergia is a parallel, 3-dimensional space-charge particle-in-cell accelerator modeling code. We present our work porting the purely MPI-based version of the code to a hybrid of CPU and GPU computing kernels. The hybrid code uses the CUDA platform in the same framework as the pure MPI solution. We have implemented a lock-free collaborative charge-deposition algorithm for the GPU, as well as other optimizations, including local communication avoidance for GPUs, a customized FFT, and fine-tuned memory access patterns. On a small GPU cluster (up to 4 Tesla C1070 GPUs), our benchmarks exhibit both superior peak performance and better scaling than a CPU cluster with 16 nodes and 128 cores. We also compare the code performance on different GPU architectures, including C1070 Tesla and K20 Kepler.
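For readers unfamiliar with the charge-deposition step of a particle-in-cell code, the following is a minimal serial sketch of linear (cloud-in-cell) deposition on a periodic 1D grid. It is not the lock-free collaborative GPU algorithm from the paper, which the abstract does not describe in detail. The function name `deposit_cic_1d`, its parameters, and the sample positions are hypothetical and chosen only for illustration.

```python
import math


def deposit_cic_1d(positions, charge, n_cells, cell_size):
    """Deposit equal point charges onto a periodic 1D grid.

    Each particle splits its charge between its two nearest grid
    points with linear (cloud-in-cell) weights. Illustrative only:
    this is a serial CPU sketch, not the paper's GPU algorithm.
    """
    rho = [0.0] * n_cells
    for x in positions:
        s = x / cell_size        # position measured in cells
        left = math.floor(s)     # grid point to the left of the particle
        frac = s - left          # fractional distance past that point
        # Both updates below are read-modify-write operations on shared
        # grid entries; this is where a parallel version would race.
        rho[left % n_cells] += charge * (1.0 - frac)
        rho[(left + 1) % n_cells] += charge * frac
    return rho


# Hypothetical input: three unit charges on a four-cell grid.
rho = deposit_cic_1d([0.25, 1.5, 3.9], charge=1.0, n_cells=4, cell_size=1.0)
total_charge = sum(rho)  # deposition should conserve the total charge
```

The two `+=` updates are the reason deposition is hard to parallelize: particles handled by different threads can target the same grid point, so a GPU implementation has to resolve those conflicts, for example with atomic operations, particle sorting, or a reduction scheme.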