Overlapping computation and communication of three-dimensional FDTD on a GPU cluster
Korea Institute of Atmospheric Prediction Systems, Seoul 156-710, Republic of Korea
Computer Physics Communications, Volume 183, Issue 11, Pages 2364-2369, 2012
@article{Kim20122364,
title={Overlapping computation and communication of three-dimensional FDTD on a GPU cluster},
journal={Computer Physics Communications},
volume={183},
number={11},
pages={2364--2369},
year={2012},
note={},
issn={0010-4655},
doi={10.1016/j.cpc.2012.06.003},
url={http://www.sciencedirect.com/science/article/pii/S0010465512002044},
author={Ki-Hwan Kim and Q-Han Park},
keywords={OpenCL}
}
Large-scale electromagnetic field simulations using the FDTD (finite-difference time-domain) method require the use of GPU (graphics processing unit) clusters. However, the communication overhead caused by slow interconnections becomes a major performance bottleneck. In this paper, as a way to remove the bottleneck, we propose the "kernel-split method" and the "host-buffer method" which overlap computation and communication for the FDTD simulation on the GPU cluster. The host-buffer method in particular enables overlapping without any modifications to the update-kernels that are already in use. We also present theoretical formulas to predict the overlap threshold and the total throughput for each method. By using our overlap methods with 6 GPU nodes, we demonstrate that the total performance of 3D FDTD reaches 92% of a six-fold increase, which is the upper limit that would be reached if there were no communication overhead.
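To make the idea of overlapping concrete, the following is a minimal sketch of an idealized timing model, not the formulas derived in the paper. It assumes that each time step costs compute time plus communication time when the two run back to back, and the larger of the two when they overlap perfectly. The function names and the sample timings are hypothetical and chosen only for illustration.

```python
def step_time(t_compute, t_comm, overlap):
    """Time for one time step on one node under the idealized model.

    Assumption: with overlap, communication is hidden behind computation
    unless it takes longer than the computation itself.
    """
    if overlap:
        return max(t_compute, t_comm)
    return t_compute + t_comm


def speedup(n_nodes, t_single, t_comm, overlap):
    """Speedup over one node that needs t_single per step and no communication.

    Assumption: the grid is split evenly, so each node computes
    for t_single / n_nodes per step.
    """
    t_compute = t_single / n_nodes
    return t_single / step_time(t_compute, t_comm, overlap)


def fraction_of_ideal(n_nodes, t_single, t_comm, overlap):
    """Achieved speedup as a fraction of the ideal n_nodes-fold increase."""
    return speedup(n_nodes, t_single, t_comm, overlap) / n_nodes


if __name__ == "__main__":
    # Hypothetical timings in arbitrary units, not measurements from the paper.
    n_nodes, t_single, t_comm = 6, 6.0, 0.5
    for overlap in (False, True):
        s = speedup(n_nodes, t_single, t_comm, overlap)
        f = fraction_of_ideal(n_nodes, t_single, t_comm, overlap)
        print(f"overlap={overlap}: speedup={s:.2f}, fraction of ideal={f:.2f}")
```

In this model the overlap stops hiding communication once the communication time exceeds the per-node compute time, which is one simple reading of an overlap threshold. For reference, 92% of a six-fold increase corresponds to a speedup of about 5.5 over a single node; the gap to 100% reflects overheads that this idealized model leaves out.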
September 22, 2012 by hgpu