{"id":9413,"date":"2013-05-19T22:10:26","date_gmt":"2013-05-19T19:10:26","guid":{"rendered":"http:\/\/hgpu.org\/?p=9413"},"modified":"2013-05-19T22:10:26","modified_gmt":"2013-05-19T19:10:26","slug":"an-implementation-of-level-set-based-topology-optimization-using-gpu","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=9413","title":{"rendered":"An implementation of level set based topology optimization using GPU"},"content":{"rendered":"<p>This work presents the implementation of a topology optimization approach based on the level set method on massively parallel computer architectures, in particular on a Graphics Processing Unit (GPU). Such architectures have become increasingly popular in recent years for demanding scientific computation. They are composed of dozens, hundreds, or even thousands of cores specially designed for parallel computing. The speedup strategy consists of using these graphics units to exploit the data parallelism of the expensive, parallelizable parts of the method, while the non-parallelizable parts are computed on standard processing units (CPUs). The paper analyzes the computational complexity of the different steps of the method. The parallelization of both the finite element method and the specific operations of the optimization approach is also analyzed. The implementation of the method is benchmarked with several tests, and the massively parallel results are compared with the sequential version of the method. The results show the advantages and disadvantages of implementing this method on a GPU.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This work presents the implementation of a topology optimization approach based on the level set method on massively parallel computer architectures, in particular on a Graphics Processing Unit (GPU). Such architectures have become increasingly popular in recent years for demanding scientific computation. 
They are composed of dozens, hundreds, or even thousands of cores specially [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[89,33,3],"tags":[659,14,263,1037,212,1786,20,1006],"class_list":["post-9413","post","type-post","status-publish","format-standard","hentry","category-nvidia-cuda","category-image-processing","category-paper","tag-computational-complexity","tag-cuda","tag-data-parallelism","tag-fem","tag-finite-element-method","tag-image-processing","tag-nvidia","tag-tesla-c2070"],"views":3854,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/9413","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=9413"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/9413\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=9413"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=9413"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?r
est_route=%2Fwp%2Fv2%2Ftags&post=9413"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}