Adapting database components to heterogeneous environments

Dimitrios Koutsoukos

ETH Zurich

ETH Zurich, 2024

DOI:10.3929/ethz-b-000679438

@phdthesis{koutsoukos2024adapting,
  title={Adapting database components to heterogeneous environments},
  author={Koutsoukos, Dimitrios},
  year={2024},
  school={ETH Zurich}
}

Data management has evolved rapidly in recent years, influenced by factors such as data explosion, the prevalence of machine and deep learning, the slowdown of Moore’s law, and the popularity of hardware accelerators. Data processing systems are trying to adapt to all these trends by building monolithic and highly specialized systems, which are blazingly fast but quickly become obsolete. In this thesis, we show how to build data processing components that adapt easily and quickly to the underlying technology and to new applications and deployments. We first explore software abstractions and how we can use them for general data processing. We find that a more granular implementation of traditional relational operators, called sub-operators, can be used to port a query engine to different platforms. Then, we investigate tensors as a core processing element for graph and relational algorithms. We use tensor computing runtimes and demonstrate that tensors are a suitable representation for accelerating execution and adapting to the underlying hardware automatically. Afterwards, we shift our attention to the I/O bottleneck and the data movement problem, one of the biggest challenges databases face today. We perform an extensive experimental analysis of Intel Optane Persistent Memory, the first public implementation of non-volatile memory. Our analysis provides insights on how best to use it and showcases the weak points of the hardware. Data movement is also a difficult problem for end users in the cloud, given that cloud providers have conflicting profit and performance goals. Therefore, we introduce serverless cracking, a modern form of database cracking adapted to Query-as-a-Service (QaaS) that can lower both the execution time and the cost for infrequent workloads based on semi-structured data. Both approaches demonstrate that, although we have made steps towards improving I/O performance, there is still much ground to cover in this area.
Our findings show how to make data management systems more robust to software and hardware evolution and how to adapt them to expanding data demands. We also underscore the critical need for continued innovation in reducing the data movement bottleneck, and we take a step towards developing more responsive and more efficient data processing systems.
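
For readers unfamiliar with database cracking, the following is a minimal sketch of the classic in-memory idea that the abstract refers to: each range query partitions only the piece of the column it touches, so the data becomes gradually more ordered as queries arrive. It is not the serverless variant introduced in the thesis, and the class and method names (`CrackedColumn`, `range_query`) are hypothetical, chosen only for illustration.

```python
import bisect


class CrackedColumn:
    """Toy cracker column (illustrative only): reorders a copy of the
    data a little on every range query instead of sorting it up front."""

    def __init__(self, values):
        self.data = list(values)  # cracker copy, reordered in place
        self.pivots = []          # sorted pivot values already cracked on
        self.bounds = []          # bounds[i]: first position with value >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece that contains `pivot` and return the
        first position holding a value >= pivot."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.bounds[i]  # already cracked on this value
        # The piece lies between the neighbouring pivots' boundaries.
        lo = self.bounds[i - 1] if i > 0 else 0
        hi = self.bounds[i] if i < len(self.bounds) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        pos = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.bounds.insert(i, pos)
        return pos

    def range_query(self, low, high):
        """Return all values v with low <= v < high (assumes low <= high)."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


col = CrackedColumn([13, 4, 55, 9, 27, 1, 42, 8])
result = col.range_query(5, 30)
print(sorted(result))
```

In this sketch the first query pays for partitioning the whole column, while later queries only touch the pieces between existing pivots, which is why the technique suits workloads whose access pattern is not known in advance.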

Tags: Computer science, CUDA, Databases, Heterogeneous systems, nVidia, nVidia P100, Thesis

June 30, 2024 by hgpu
