Arax: a runtime framework for decoupling applications from heterogeneous accelerators

Manos Pavlidakis, Stelios Mavridis, Antony Chazapis, Giorgos Vasiliadis, Angelos Bilas

Institute of Computer Science, Foundation for Research and Technology – Hellas, Greece

Proceedings of the 13th Symposium on Cloud Computing (SoCC ’22), 2022

DOI:10.1145/3542929.3563467

Today, using multiple heterogeneous accelerators efficiently from applications and high-level frameworks, such as TensorFlow and Caffe, poses significant challenges in three respects: (a) sharing accelerators, (b) allocating available resources elastically during application execution, and (c) reducing the required programming effort. In this paper, we present Arax, a runtime system that decouples applications from heterogeneous accelerators within a server. First, Arax maps application tasks dynamically to available resources, managing all required task state, memory allocations, and task dependencies. As a result, Arax can share accelerators across applications in a server and adjust the resources used by each application as load fluctuates over time. Additionally, Arax offers a simple API and includes Autotalk, a stub generator that automatically generates stub libraries for applications already written for specific accelerator types, such as NVIDIA GPUs. Consequently, Arax applications are written once without considering physical details, including the number and type of accelerators. Our results show that applications, such as Caffe, TensorFlow, and Rodinia, can run using Arax with minimal effort and a low overhead of about 12% (geometric mean) compared to native execution. Arax supports efficient accelerator sharing, offering up to 20% better execution times than NVIDIA MPS, which supports NVIDIA GPUs only. Arax can also transparently provide elasticity, decreasing total application turn-around time by up to 2X compared to native execution without elasticity support.

Tags: Computer science, CUDA, FPGA, Heterogeneous systems, Neural networks, nVidia, nVidia GeForce RTX 2080, OpenCL, Package

January 8, 2023 by hgpu
