Overlapping computation and communication of three-dimensional FDTD on a GPU cluster
Korea Institute of Atmospheric Prediction Systems, Seoul 156-710, Republic of Korea
Computer Physics Communications, Volume 183, Issue 11, Pages 2364-2369, 2012
Large-scale electromagnetic field simulations with the finite-difference time-domain (FDTD) method require clusters of graphics processing units (GPUs). However, the communication overhead caused by slow interconnects becomes a major performance bottleneck. To remove this bottleneck, we propose the "kernel-split method" and the "host-buffer method", both of which overlap computation with communication in FDTD simulations on a GPU cluster. The host-buffer method in particular enables overlapping without any modification to the update kernels already in use. We also present theoretical formulas that predict the overlap threshold and the total throughput for each method. Using our overlap methods on 6 GPU nodes, we demonstrate that the total performance of 3D FDTD reaches 92% of the ideal six-fold speedup, which is the upper limit that would be reached if there were no communication overhead.
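The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of why overlapping helps. It uses a generic idealized model, not the authors' derivation: without overlap a time step costs compute time plus communication time, and with perfect overlap it costs the larger of the two. The function names, the timing values, and the assumption of perfect strong scaling of the compute part are all hypothetical.

```python
# Toy model of overlapping computation and communication.
# ASSUMPTION: this is a generic idealized model, not the formulas from the paper.

def step_time(t_compute, t_comm, overlap):
    """Time for one time step on one node (arbitrary time units)."""
    if overlap:
        # Perfect overlap: the shorter phase is completely hidden by the longer one.
        return max(t_compute, t_comm)
    # No overlap: the two phases run back to back.
    return t_compute + t_comm


def speedup(n_nodes, t_compute_single, t_comm, overlap):
    """Speedup over a single node, assuming compute time divides evenly by n_nodes."""
    t_compute = t_compute_single / n_nodes
    return t_compute_single / step_time(t_compute, t_comm, overlap)


def reported_speedup(efficiency, n_nodes):
    """Convert 'fraction of the ideal n-fold speedup' into an absolute speedup."""
    return efficiency * n_nodes


if __name__ == "__main__":
    # Hypothetical timings: 6.0 units of compute on one node, 0.5 units of halo exchange.
    print(speedup(6, 6.0, 0.5, overlap=True))    # 6.0
    print(speedup(6, 6.0, 0.5, overlap=False))   # 4.0
    # The abstract's figure: 92% of six-fold is about a 5.52x speedup.
    print(round(reported_speedup(0.92, 6), 2))   # 5.52
```

In this toy model communication is fully hidden whenever `t_comm <= t_compute`. That condition is a simplified stand-in for the idea of an overlap threshold; the paper's own threshold and throughput formulas may account for effects this sketch ignores.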
September 22, 2012 by hgpu