high performance computing on graphics processing units: hgpu.org

A Performance-Portable SYCL Implementation of CRK-HACC for Exascale

Esteban M. Rangel, S. John Pennycook, Adrian Pope, Nicholas Frontiere, Zhiqiang Ma, Varsha Madananth

Argonne National Laboratory, USA

arXiv:2310.16122 [cs.PF] (24 Oct 2023)

DOI:10.48550/arXiv.2310.16122

@misc{rangel2023performanceportable,
  title={A Performance-Portable SYCL Implementation of CRK-HACC for Exascale},
  author={Esteban M. Rangel and S. John Pennycook and Adrian Pope and Nicholas Frontiere and Zhiqiang Ma and Varsha Madananth},
  year={2023},
  eprint={2310.16122},
  archivePrefix={arXiv},
  primaryClass={cs.PF}
}

The first generation of exascale systems will include a variety of machine architectures, featuring GPUs from multiple vendors. As a result, many developers are interested in adopting portable programming models to avoid maintaining multiple versions of their code. It is necessary to document experiences with such programming models to assist developers in understanding the advantages and disadvantages of different approaches. To this end, this paper evaluates the performance portability of a SYCL implementation of a large-scale cosmology application (CRK-HACC) running on GPUs from three different vendors: AMD, Intel, and NVIDIA. We detail the process of migrating the original code from CUDA to SYCL and show that specializing kernels for specific targets can greatly improve performance portability without significantly impacting programmer productivity. The SYCL version of CRK-HACC achieves a performance portability of 0.96 with a code divergence of almost 0, demonstrating that SYCL is a viable programming model for performance-portable applications.
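The abstract reports two summary numbers without defining them. As a rough illustration, the sketch below assumes the commonly used definitions: performance portability as the harmonic mean of an application's efficiency across a set of platforms (zero if any platform is unsupported), and code divergence as the mean pairwise Jaccard distance between the sets of source lines used on each platform. Both definitions are assumptions here, since the abstract does not state them, and the function names and sample values are hypothetical and not taken from the paper.

```python
from itertools import combinations


def performance_portability(efficiencies):
    """Harmonic mean of per-platform efficiencies (assumed definition).

    Each efficiency is a fraction in (0, 1]; None or a non-positive value
    marks an unsupported platform, which forces the metric to 0.
    """
    if not efficiencies:
        return 0.0
    if any(e is None or e <= 0 for e in efficiencies):
        return 0.0
    return len(efficiencies) / sum(1.0 / e for e in efficiencies)


def code_divergence(lines_per_platform):
    """Mean pairwise Jaccard distance between per-platform source-line sets
    (assumed definition). 0 means every platform compiles the same lines.
    """
    pairs = list(combinations(lines_per_platform, 2))
    if not pairs:
        return 0.0

    def jaccard_distance(a, b):
        union = a | b
        return 1.0 - len(a & b) / len(union) if union else 0.0

    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical inputs for three GPU targets; NOT measurements from the paper.
example_efficiencies = [0.95, 0.97, 0.96]
example_line_sets = [
    set(range(0, 1000)),            # lines used on target A
    set(range(0, 1000)),            # lines used on target B
    set(range(0, 995)) | {2000},    # target C swaps in one specialized line
]

if __name__ == "__main__":
    print("performance portability:", performance_portability(example_efficiencies))
    print("code divergence:", code_divergence(example_line_sets))
```

Under these assumed definitions, a harmonic mean penalizes a single poorly performing platform more than an arithmetic mean would, so a score near 1 suggests consistently high efficiency on every target, and a divergence near 0 suggests that almost all source lines are shared across targets.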

Tags: AMD Radeon Instinct MI250X, ATI, Computer science, Cosmology, CUDA, HIP, HPC, nVidia, nVidia A100, Performance, performance portability, Physics, SYCL

October 29, 2023 by hgpu
