high performance computing on graphics processing units: hgpu.org

Challenges and Techniques for Transparent Acceleration of Unmodified Big Data Applications

Maria Nektaria Xekalaki

University of Manchester

University of Manchester, 2022

@article{xekalaki2022challenges,
  title={Challenges and Techniques for Transparent Acceleration of Unmodified Big Data Applications},
  author={Xekalaki, Maria Nektaria},
  year={2022}
}

Source code package: Flink-TornadoVM-Artifact

The ever-increasing demand for high-performance Big Data analytics and data processing has paved the way for heterogeneous hardware accelerators, such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs), to be integrated into modern Big Data platforms. Currently, this integration comes at the cost of programmability, because the end-user Application Programming Interface (API) of a Big Data framework must be altered to expose the underlying heterogeneous hardware. In some cases, developers are even required to write their application code in a low-level programming language that targets specific hardware accelerators (e.g., CUDA or OpenCL). The purpose of this thesis is to identify the current barriers to the automatic acceleration of Big Data applications and to propose techniques that lift the resulting restrictions. Specifically, this thesis presents the first Big Data platform that can dynamically take advantage of GPUs and FPGAs to accelerate unmodified applications in a manner that is completely transparent to the user. This novel heterogeneous platform has been prototyped in the context of Apache Flink, a widely used Big Data platform, and TornadoVM, an open-source framework that automatically compiles and executes Java applications on GPUs, FPGAs, and multi-core CPUs. The presented techniques are not bound to these frameworks and can be applied to other software platforms with slight modifications. The proposed solution has been evaluated on both standard benchmarks and industrial use cases, showing speedups of up to 65x on GPUs and 184x on FPGAs over vanilla Apache Flink running on traditional multi-core CPUs.

Tags: Benchmarking, Computer science, CUDA, FPGA, Heterogeneous systems, Java, nVidia, nVidia GeForce GTX 1060, nVidia Quadro GP100, OpenCL, Package, Performance, Tesla V100, Thesis

November 20, 2022 by hgpu
