18699

HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines

Periklis Chrysogelos, Manos Karpathiotakis, Raja Appuswamy, Anastasia Ailamaki
EPFL
45th International Conference on Very Large Data Bases, Los Angeles, California, USA, August 26-30, 2019

@article{Chrysogelos:262531,

   title={HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines},

   author={Chrysogelos, Periklis and Karpathiotakis, Manos and Appuswamy, Raja and Ailamaki, Anastasia},

   journal={Proceedings of the VLDB Endowment},

   pages={13},

   year={2019},

   url={http://infoscience.epfl.ch/record/262531}

}

Download Download (PDF)   View View   Source Source   

1642

views

Modern server hardware is increasingly heterogeneous as hardware accelerators, such as GPUs, are used together with multicore CPUs to meet the computational demands of modern data analytics workloads. Unfortunately, query parallelization techniques used by analytical database engines are designed for homogeneous multicore servers, where query plans are parallelized across CPUs to process data stored in cache coherent shared memory. Thus, these techniques are unable to fully exploit available heterogeneous hardware, where one needs to exploit task-parallelism of CPUs and data-parallelism of GPUs for processing data stored in a deep, non-cache-coherent memory hierarchy with widely varying access latencies and bandwidth. In this paper, we introduce HetExchange-a parallel query execution framework that encapsulates the heterogeneous parallelism of modern multi-CPU-multi-GPU servers and enables the parallelization of (pre-)existing sequential relational operators. In contrast to the interpreted nature of traditional Exchange, HetExchange is designed to be used in conjunction with JIT compiled engines in order to allow a tight integration with the proposed operators and generation of efficient code for heterogeneous hardware. We validate the applicability and efficiency of our design by building a prototype that can operate over both CPUs and GPUs, and enables its operators to be parallelism- and data-location-agnostic. In doing so, we show that efficiently exploiting CPU-GPU parallelism can provide 2.8x and 6.4x improvement in performance than state-of-the-art CPU-based and GPU-based DBMS.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: