{"id":20721,"date":"2020-04-26T11:42:20","date_gmt":"2020-04-26T08:42:20","guid":{"rendered":"https:\/\/hgpu.org\/?p=20721"},"modified":"2020-04-26T11:42:20","modified_gmt":"2020-04-26T08:42:20","slug":"cpp-taskflow-a-general-purpose-parallel-and-heterogeneous-task-programming-system-at-scale","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=20721","title":{"rendered":"Cpp-Taskflow: A General-purpose Parallel and Heterogeneous Task Programming System at Scale"},"content":{"rendered":"<p>The Cpp-Taskflow project addresses the long-standing question: How can we make it easier for developers to write parallel and heterogeneous programs with high performance and simultaneous high productivity? Cpp-Taskflow develops a simple and powerful task programming model to enable efficient implementations of heterogeneous decomposition strategies. Our programming model empowers users with both static and dynamic task graph constructions to incorporate a broad range of computational patterns including hybrid CPU-GPU computing, dynamic control flow, and irregularity. We develop an efficient heterogeneous work-stealing strategy that adapts worker threads to available task parallelism at any time during the graph execution. We have demonstrated promising performance of Cpp-Taskflow on both micro-benchmark and real-world applications. As an example, we solved a large machine learning workload by up to 1.5x faster, 1.6x less memory, and 1.7x fewer lines of code than two industrial-strength systems, oneTBB and StarPU, on a machine of 40 CPUs and 4 GPUs.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Cpp-Taskflow project addresses the long-standing question: How can we make it easier for developers to write parallel and heterogeneous programs with high performance and simultaneous high productivity? Cpp-Taskflow develops a simple and powerful task programming model to enable efficient implementations of heterogeneous decomposition strategies. Our programming model empowers users with both static and dynamic [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,89,3],"tags":[1782,14,452,1025,20,2046,176,67],"class_list":["post-20721","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-nvidia-cuda","category-paper","tag-computer-science","tag-cuda","tag-heterogeneous-systems","tag-machine-learning","tag-nvidia","tag-nvidia-geforce-rtx-2080","tag-package","tag-performance"],"views":2090,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/20721","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=20721"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/20721\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=20721"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=20721"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=20721"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}