{"id":1579,"date":"2010-11-22T13:50:47","date_gmt":"2010-11-22T13:50:47","guid":{"rendered":"http:\/\/hgpu.org\/?p=1579"},"modified":"2010-11-22T13:50:47","modified_gmt":"2010-11-22T13:50:47","slug":"on-optimization-of-finite-difference-time-domain-fdtd-computation-on-heterogeneous-and-gpu-clusters","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=1579","title":{"rendered":"On optimization of finite-difference time-domain (FDTD) computation on heterogeneous and GPU clusters"},"content":{"rendered":"<p>A model for the computational cost of finite-difference time-domain (FDTD) method irrespective of implementation details or the application domain is given. The model is used to formalize the problem of optimal distribution of computational load to an arbitrary set of resources across a heterogeneous cluster. We show that the problem can be formulated as a minimax optimization problem and derive analytic lower bounds for the computational cost. The work provides insight into optimal design of FDTD parallel software. Our formulation of the load distribution problem takes simultaneously into account the computational and communication costs. We demonstrate that significant performance gains, as much as 75%, can be achieved by proper load distribution.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A model for the computational cost of finite-difference time-domain (FDTD) method irrespective of implementation details or the application domain is given. The model is used to formalize the problem of optimal distribution of computational load to an arbitrary set of resources across a heterogeneous cluster. We show that the problem can be formulated as a [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[11,3],"tags":[1782,323,327,106,20,298],"class_list":["post-1579","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-paper","tag-computer-science","tag-fdtd","tag-finite-difference","tag-gpu-cluster","tag-nvidia","tag-optimization"],"views":2237,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/1579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1579"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/1579\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1579"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1579"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}