{"id":15027,"date":"2015-12-01T22:20:18","date_gmt":"2015-12-01T20:20:18","guid":{"rendered":"http:\/\/hgpu.org\/?p=15027"},"modified":"2015-12-01T22:20:18","modified_gmt":"2015-12-01T20:20:18","slug":"bridging-opencl-and-cuda-a-comparative-analysis-and-translation","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=15027","title":{"rendered":"Bridging OpenCL and CUDA: A Comparative Analysis and Translation"},"content":{"rendered":"<p>Heterogeneous systems are widening their user-base, and heterogeneous computing is becoming popular in supercomputing. Among others, OpenCL and CUDA are the most popular programming models for heterogeneous systems. Although OpenCL inherited many features from CUDA and they have almost the same platform model, they are not compatible with each other. In this paper, we present similarities and differences between them and propose an automatic translation framework for both OpenCL to CUDA and CUDA to OpenCL. We describe features that make it difficult to translate from one to the other and provide our solution. We show that our translator achieves comparable performance between the original and target applications in both directions. Since each programming model separately has a wide user-base and large code-base, our translation framework is useful to extend the code-base for each programming model and unifies the efforts to develop applications for heterogeneous systems.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Heterogeneous systems are widening their user-base, and heterogeneous computing is becoming popular in supercomputing. Among others, OpenCL and CUDA are the most popular programming models for heterogeneous systems. Although OpenCL inherited many features from CUDA and they have almost the same platform model, they are not compatible with each other. In this paper, we present [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,89,90,3],"tags":[7,1307,955,1782,14,452,20,1470,1793],"class_list":["post-15027","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-nvidia-cuda","category-opencl","category-paper","tag-ati","tag-ati-radeon-hd-7970","tag-compilers","tag-computer-science","tag-cuda","tag-heterogeneous-systems","tag-nvidia","tag-nvidia-geforce-gtx-titan","tag-opencl"],"views":3334,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/15027","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=15027"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/15027\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=15027"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=15027"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=15027"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}