{"id":11737,"date":"2014-03-26T23:05:48","date_gmt":"2014-03-26T21:05:48","guid":{"rendered":"http:\/\/hgpu.org\/?p=11737"},"modified":"2014-03-26T23:05:48","modified_gmt":"2014-03-26T21:05:48","slug":"accelerating-gpu-implementation-of-contourlet-transform","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=11737","title":{"rendered":"Accelerating GPU Implementation of Contourlet Transform"},"content":{"rendered":"<p>The widespread use of the contourlet transform (CT) and today\u2019s real-time requirements demand faster execution of CT. Existing solutions suffer from a lack of portability or high computational cost, which makes them unsuitable for real-time applications. In this paper we exploit modern GPUs to accelerate CT. GPUs are well suited to data-parallel applications such as CT. The convolution step of CT, which is the most computationally intensive, is reshaped for parallel processing. The whole transform is then executed on the GPU, avoiding time-consuming data transfers between the host and the device. Experimental results show that on current GPUs the CT achieves a speedup of more than 19x over a non-parallel CPU-based implementation. Computing the transform of a 512\u00d7512 image takes approximately 40 ms, which should be sufficient for real-time applications.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The widespread use of the contourlet transform (CT) and today\u2019s real-time requirements demand faster execution of CT. Existing solutions suffer from a lack of portability or high computational cost, which makes them unsuitable for real-time applications. In this paper we exploit modern GPUs to accelerate CT. 
GPUs are well suited to data-parallel applications [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[36,89,33,3],"tags":[1787,14,1786,20,1573,1091],"class_list":["post-11737","post","type-post","status-publish","format-standard","hentry","category-algorithms","category-nvidia-cuda","category-image-processing","category-paper","tag-algorithms","tag-cuda","tag-image-processing","tag-nvidia","tag-nvidia-geforce-610-m","tag-nvidia-geforce-gtx-570"],"views":2943,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/11737","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=11737"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/11737\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=11737"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=11737"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%
2Ftags&post=11737"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}