{"id":28622,"date":"2023-09-24T16:27:44","date_gmt":"2023-09-24T13:27:44","guid":{"rendered":"https:\/\/hgpu.org\/?p=28622"},"modified":"2023-09-24T16:27:44","modified_gmt":"2023-09-24T13:27:44","slug":"julia-as-a-unifying-end-to-end-workflow-language-on-the-frontier-exascale-system","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=28622","title":{"rendered":"Julia as a unifying end-to-end workflow language on the Frontier exascale system"},"content":{"rendered":"<p>We evaluate using Julia as a single language and ecosystem paradigm powered by LLVM to develop workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy&#8217;s first exascale supercomputer. We evaluate the feasibility, performance, scaling, and trade-offs of (i) the computational kernel on AMD&#8217;s MI250x GPUs, (ii) weak scaling up to 4,096 MPI processes\/GPUs or 512 nodes, (iii) parallel I\/O writes using the ADIOS2 library bindings, and (iv) Jupyter Notebooks for interactive data analysis. Our results suggest that although Julia generates a reasonable LLVM-IR kernel, a nearly 50% performance difference exists vs. native AMD HIP stencil codes when running on the GPUs. As expected, we observed near-zero overhead when using MPI and parallel I\/O bindings for system-wide installed implementations. Consequently, Julia emerges as a compelling high-performance and high-productivity workflow composition strategy, as measured on the fastest supercomputer in the world.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We evaluate using Julia as a single language and ecosystem paradigm powered by LLVM to develop workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy&#8217;s first exascale supercomputer. We evaluate the feasibility, performance, scaling, and trade-offs of (i) the [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,3],"tags":[2099,7,1782,2063,1682,2068,242,176],"class_list":["post-28622","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-paper","tag-amd-radeon-instinct-mi250x","tag-ati","tag-computer-science","tag-hip","tag-hpc","tag-julia","tag-mpi","tag-package"],"views":1888,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/28622","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=28622"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/28622\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=28622"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=28622"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=28622"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}