{"id":7449,"date":"2012-04-17T20:46:15","date_gmt":"2012-04-17T17:46:15","guid":{"rendered":"http:\/\/hgpu.org\/?p=7449"},"modified":"2012-04-17T21:07:07","modified_gmt":"2012-04-17T18:07:07","slug":"high-performance-stencil-code-algorithms-for-gpgpus","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=7449","title":{"rendered":"High Performance Stencil Code Algorithms for GPGPUs"},"content":{"rendered":"<p>In this paper we investigate how stencil computations can be<br \/>\n  implemented on state-of-the-art general purpose graphics processing<br \/>\n  units (GPGPUs). Stencil codes can be found at the core of many<br \/>\n  numerical solvers and physical simulation codes and are therefore of<br \/>\n  particular interest to scientific computing research. GPGPUs have<br \/>\n  gained a lot of attention recently because of their superior<br \/>\n  floating point performance and memory bandwidth. Nevertheless,<br \/>\n  memory-bound stencil codes in particular have proven challenging<br \/>\n  for GPGPUs, yielding lower-than-expected speedups.<\/p>\n<p>  We chose the Jacobi method as a standard benchmark to evaluate a set<br \/>\n  of algorithms on NVIDIA&#8217;s latest Fermi chipset. One of our fastest<br \/>\n  algorithms is a parallel wavefront update. It exploits the enlarged<br \/>\n  on-chip shared memory to perform two time-step updates per sweep. To<br \/>\n  the best of our knowledge, it represents the first successful<br \/>\n  application of temporal blocking for 3D stencils on GPGPUs and<br \/>\n  thereby exceeds previous results by a considerable margin. This is also<br \/>\n  the first paper to study stencil codes on Fermi.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this paper we investigate how stencil computations can be implemented on state-of-the-art general purpose graphics processing units (GPGPUs). 
Stencil codes can be found at the core of many numerical solvers and physical simulation codes and are therefore of particular interest to scientific computing research. GPGPUs have gained a lot of attention recently because of [&hellip;]<\/p>\n","protected":false},"author":65,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,89,3],"tags":[1782,20,379,378],"class_list":["post-7449","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-nvidia-cuda","category-paper","tag-computer-science","tag-nvidia","tag-nvidia-geforce-gtx-480","tag-tesla-c2050"],"views":1911,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/7449","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/65"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7449"}],"version-history":[{"count":4,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/7449\/revisions"}],"predecessor-version":[{"id":7453,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/7449\/revisions\/7453"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7449"}],"wp:term":[{"taxonomy":"category","
embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7449"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7449"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}