Code Generation for a Variety of Accelerators for a Graph DSL
IIT Madras, India
arXiv:2401.02472 [cs.DC] (4 Jan 2024)
@misc{kumar2024code,
  title={Code Generation for a Variety of Accelerators for a Graph DSL},
  author={Ashwina Kumar and M. Venkata Krishna and Prasanna Bartakke and Rahul Kumar and Rajesh Pandian M and Nibedita Behera and Rupesh Nasre},
  year={2024},
  eprint={2401.02472},
  archivePrefix={arXiv},
  primaryClass={cs.DC}
}
Sparse graphs are ubiquitous in real and virtual worlds. With the phenomenal growth in semi-structured and unstructured data, the underlying graphs have grown rapidly in size over the years. Analyzing such large structures necessitates parallel processing, which is challenged by the intrinsic irregularity of sparse computation, memory access, and communication. Ideally, programmers and domain experts would focus only on the sequential computation while a compiler auto-generates the parallel code. On the other hand, target hardware devices vary widely, and achieving optimal performance often demands coding in specific languages or frameworks. Our goal in this work is a graph DSL that lets domain experts write almost-sequential code and that generates parallel code for different accelerators from the same algorithmic specification. In particular, we illustrate code generation from the StarPlat graph DSL for NVIDIA, AMD, and Intel GPUs using the CUDA, OpenCL, SYCL, and OpenACC programming models. Using a suite of ten large graphs and four popular algorithms, we demonstrate the efficacy of StarPlat’s versatile code generator.
January 14, 2024 by hgpu