{"id":2746,"date":"2011-02-06T12:38:59","date_gmt":"2011-02-06T12:38:59","guid":{"rendered":"http:\/\/hgpu.org\/?p=2746"},"modified":"2011-02-06T12:38:59","modified_gmt":"2011-02-06T12:38:59","slug":"automatically-translating-a-general-purpose-c-image-processing-library-for-gpus","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=2746","title":{"rendered":"Automatically translating a general purpose C++ image processing library for GPUs"},"content":{"rendered":"<p>This paper presents work-in-progress towards a C++ source-to-source translator that automatically seeks parallelizable code fragments and replaces them with code for a graphics co-processor. We report on our experience with accelerating an industrial image processing library. To increase the effectiveness of our approach, we exploit some domain-specific knowledge of the library&#8217;s semantics. We outline the architecture of our translator and how it uses the ROSE source-to-source transformation library to overcome complexities in the C++ language. Techniques for parallel analysis and source transformation are presented in light of their uses in GPU code generation. We conclude with results from a performance evaluation of two examples, image blending and an erosion filter, hand-translated with our parallelization techniques. We show that our approach has potential and explain some of the remaining challenges in building an effective tool.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This paper presents work-in-progress towards a C++ source-to-source translator that automatically seeks parallelizable code fragments and replaces them with code for a graphics co-processor. We report on our experience with accelerating an industrial image processing library. To increase the effectiveness of our approach, we exploit some domain-specific knowledge of the library&#8217;s semantics. We outline the [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,3],"tags":[955,1782,187,95,20,247,182],"class_list":["post-2746","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-paper","tag-compilers","tag-computer-science","tag-glsl","tag-high-level-languages","tag-nvidia","tag-nvidia-geforce-7800-gtx","tag-opengl"],"views":2042,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/2746","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2746"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/2746\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2746"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2746"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2746"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}