Performance evaluation of the multi-device OpenCL FDTD solver
ETH Zurich, Integrated Systems Laboratory, Gloriastrasse 35, 8092, Switzerland
Proceedings of the 5th European Conference on Antennas and Propagation (EUCAP), 2011
@inproceedings{stefanskiperformance,
title={Performance evaluation of the multi-device OpenCL FDTD solver},
author={Stefanski, T.P. and Chavannes, N. and Kuster, N.},
booktitle={Antennas and Propagation (EUCAP), Proceedings of the 5th European Conference on},
pages={3995--3998},
organization={IEEE},
year={2011}
}
We present an evaluation of a multi-device OpenCL FDTD solver. The main advantage of this solver is portability, both between hardware from different vendors and between the highly specialized parallel computing architectures available on the market, i.e., GPUs, multi-core CPUs, and devices integrating both technologies on a single die. For execution on GPUs, the computational domain is decomposed along the slowest direction, and electromagnetic field boundary data is shared between neighboring subdomains. The communication overhead between GPUs is proportional to the area of the boundary and represents the rate-limiting step of the method. For sufficiently large simulation domains, the hardware used allows the communication overhead to be hidden by computations, giving a scaling efficiency higher than 90%. CPUs placed in different sockets on a motherboard are visible to the OpenCL driver as a single computing device with an aggregated number of cores, so domain decomposition is not necessary when the solver runs on multi-core CPUs. The paper then presents numerical tests that evaluate the developed code in realistic simulations of problems in computational electromagnetics.
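The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that "slowest direction" means the slowest-varying grid index and that each interface between neighboring slabs exchanges one layer of cells. The function names (`split_along_slowest_axis`, `interface_cells`, `boundary_to_volume_ratio`) and the grid sizes are hypothetical. Under these assumptions the exchanged data grows with the interface area while the work grows with the volume, which is consistent with the statement that communication can be hidden for sufficiently large domains.

```python
def split_along_slowest_axis(n_slow, n_devices):
    """Partition n_slow grid layers into n_devices contiguous slabs.

    Returns a list of (start, stop) index ranges along the slowest axis.
    Any remainder layers are given to the first slabs, one each.
    """
    base, extra = divmod(n_slow, n_devices)
    slabs = []
    start = 0
    for device in range(n_devices):
        size = base + (1 if device < extra else 0)
        slabs.append((start, start + size))
        start += size
    return slabs


def interface_cells(n_fast, n_mid, n_devices):
    """Count cells lying on the internal interfaces between neighboring slabs.

    Assumption: one layer of n_fast * n_mid cells per interface.
    """
    return (n_devices - 1) * n_fast * n_mid


def boundary_to_volume_ratio(n_fast, n_mid, n_slow, n_devices):
    """Ratio of exchanged interface cells to all cells in the domain."""
    return interface_cells(n_fast, n_mid, n_devices) / (n_fast * n_mid * n_slow)


# Hypothetical example: 100 layers along the slowest axis, three devices.
slabs = split_along_slowest_axis(n_slow=100, n_devices=3)
ratio = boundary_to_volume_ratio(n_fast=10, n_mid=20, n_slow=100, n_devices=3)
print(slabs)
print(ratio)
```

In this simplified model the ratio reduces to (n_devices - 1) / n_slow, so doubling the extent along the slowest axis halves the relative communication cost.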
June 21, 2011 by hgpu