{"id":11228,"date":"2014-01-12T23:51:15","date_gmt":"2014-01-12T21:51:15","guid":{"rendered":"http:\/\/hgpu.org\/?p=11228"},"modified":"2014-01-12T23:51:15","modified_gmt":"2014-01-12T21:51:15","slug":"gpu-accelerated-parallel-fdtd-on-distributed-heterogeneous-platform","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=11228","title":{"rendered":"GPU-Accelerated parallel FDTD on Distributed Heterogeneous Platform"},"content":{"rendered":"<p>This paper introduces a (Finite-Difference Time-Domain) FDTD code written in Fortran and CUDA for realistic electromagnetic calculations with parallelization methods of Message Passing Interface (MPI) and Open Multi-Processing (OpenMP). Since both Central Processing Unit (CPU) and Graphics Processing Unit (GPU) resources are utilized, a faster execution speed can be reached compared to a traditional pure GPU code. In our experiments, 64 NVIDIA TESLA K20m GPUs and 64 INTEL XEON E5-2670 CPUs are used to carry out the pure CPU, pure GPU and CPU + GPU tests. Relative to the pure CPU calculations for the same problems, the speedup ratio achieved by CPU + GPU calculations is around 14. Compared to the pure GPU calculations for the same problems, the CPU + GPU calculations have 7.6 % &#8211; 13.2 % performance improvement. Because of the small memory size of GPUs, the FDTD problem size is usually very small. However, this code can enlarge the maximum problem size by 25 % without reducing the performance of traditional pure GPU code. Finally, using this code, a microstrip antenna array with 16 x 18 elements is calculated and the radiation patterns are compared with the ones of MoM. Results show that there is a well agreement between them.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This paper introduces a (Finite-Difference Time-Domain) FDTD code written in Fortran and CUDA for realistic electromagnetic calculations with parallelization methods of Message Passing Interface (MPI) and Open Multi-Processing (OpenMP). Since both Central Processing Unit (CPU) and Graphics Processing Unit (GPU) resources are utilized, a faster execution speed can be reached compared to a traditional pure [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[89,319,3],"tags":[14,1802,323,322,989,452,242,20,1390],"class_list":["post-11228","post","type-post","status-publish","format-standard","hentry","category-nvidia-cuda","category-electrodynamics","category-paper","tag-cuda","tag-electrodynamics","tag-fdtd","tag-finite-difference-time-domain","tag-fortran","tag-heterogeneous-systems","tag-mpi","tag-nvidia","tag-tesla-k20"],"views":2377,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/11228","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=11228"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/11228\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=11228"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=11228"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=11228"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}