{"id":17761,"date":"2017-11-12T16:08:19","date_gmt":"2017-11-12T14:08:19","guid":{"rendered":"https:\/\/hgpu.org\/?p=17761"},"modified":"2017-11-12T16:08:19","modified_gmt":"2017-11-12T14:08:19","slug":"best-practice-guide-gpgpu","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=17761","title":{"rendered":"Best Practice Guide &#8211; GPGPU"},"content":{"rendered":"<p>Graphics Processing Units (GPUs) were originally developed for computer gaming and other graphical tasks, but for many years have been exploited for general purpose computing across a number of areas. They offer advantages over traditional CPUs because they have greater computational capability, and use high-bandwidth memory systems (where memory bandwidth is the main bottleneck for many scientific applications). This Best Practice Guide describes GPUs: it explains how to get started with programming GPUs, which cannot be used in isolation but serve as &quot;accelerators&quot; in conjunction with CPUs, and how to get good performance. The focus is on NVIDIA GPUs, which are the most widespread today. Section 2, &quot;The GPU Architecture&quot;, describes the GPU architecture, with a focus on the latest &quot;Pascal&quot; generation of NVIDIA GPUs, and gives attention to the architectural reasons why GPUs offer performance benefits. This section also includes details of GPU-accelerated services within the PRACE HPC ecosystem. Section 3, &quot;GPU Programming with CUDA&quot;, describes the NVIDIA CUDA programming model, which includes the necessary extensions to manage parallel execution and data movement, and shows how to write a simple CUDA code. Often it is relatively simple to write a working CUDA application, but more work is needed to get good performance. A range of optimisation techniques is presented in Section 4, &quot;Best Practice for Optimizing Codes on GPUs&quot;. 
Large-scale applications will require use of multiple GPUs in parallel: this is addressed in Section 5, &quot;Multi-GPU Programming&quot;. Many GPU-enabled libraries exist for common operations: these can facilitate programming in many cases. Some of the popular libraries are described in Section 6, &quot;GPU Libraries&quot;. Finally, CUDA is not the only option for programming GPUs and alternative models are described in Section 7, &quot;Other Programming Models for GPUs&quot;.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Graphics Processing Units (GPUs) were originally developed for computer gaming and other graphical tasks, but for many years have been exploited for general purpose computing across a number of areas. They offer advantages over traditional CPUs because they have greater computational capability, and use high-bandwidth memory systems (where memory bandwidth is the main bottleneck for [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,89,90,3],"tags":[1782,14,1321,1793,252,31],"class_list":["post-17761","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-nvidia-cuda","category-opencl","category-paper","tag-computer-science","tag-cuda","tag-openacc","tag-opencl","tag-openmp","tag-review"],"views":6269,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/17761","targetHints":{"allow":["GET"]}}],"collection":[{"href":"ht
tps:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=17761"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/17761\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=17761"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=17761"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=17761"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}