Applicability of GPU Computing for Efficient Merge in In-Memory Databases

hgpu.org » Programming » Algorithms » Applicability of GPU Computing for Efficient Merge in In-Memory Databases

Applicability of GPU Computing for Efficient Merge in In-Memory Databases

Jens Krueger, Martin Grund, Ingo Jaeckel, Alexander Zeier, Hasso Plattner

Hasso Plattner Institute for IT Systems Engineering, University of Potsdam, Potsdam, Germany

Second International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS 2011), 2011

BibTeX

Download (PDF)

View

Source

1855

views

Column oriented in-memory databases typically use dictionary compression to reduce the overall storage space and allow fast lookup and comparison. However, there is a high performance cost for updates since the dictionary, used for compression, has to be recreated each time records are created, updated or deleted. This has to be taken into account for TPC-C like workloads with around 45% of all queries being transactional modifications. A technique called differential updates can be used to allow faster modifications. In addition to the main storage, the database then maintains a delta storage to accommodate modifying queries. During the merge process, the modifications of the delta are merged with the main storage in parallel to the normal operation of the database. Current hardware and software trends suggest that this problem can be tackled by massively parallelizing the merge process. One approach to massive parallelism are GPUs that offer order of magnitudes more cores than modern CPUs. Therefore, we analyze the feasibility of a parallel GPU merge implementation and its potential speedup. We found that the maximum potential merge speedup is limited since only two of its four stages are likely to benefit from parallelization. We present a parallel dictionary slice merge algorithm as well as an alternative parallel merge algorithm for GPUs that achieves up to 40% more throughput than its CPU implementation. In addition, we propose a parallel duplicate removal algorithm that achieves up to 27 times the throughput of the CPU implementation.

Tags: Algorithms, Compression, Computer science, CUDA, Databases, nVidia, Tesla C2050

November 3, 2011 by hgpu

No votes yet.

Please wait...

Your response

You must be logged in to post a comment.

* * *

high performance computing on graphics processing units: hgpu.org