A Parallel Twig Join Algorithm for XML Processing using a GPGPU
Computer Science Department, Technion
Third International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS’12), 2012
@article{shnaiderman2012practical,
title={A Parallel Twig Join Algorithm for XML Processing using a GPGPU},
author={Shnaiderman, Lila and Shmueli, Oded},
year={2012}
}
With an increasing amount of data and demand for fast query processing, the efficiency of database operations continues to be a challenging task. A common approach is to leverage parallel hardware platforms. With the introduction of general-purpose GPU (Graphics Processing Unit) computing, massively parallel hardware has become available within commodity hardware. XML is based on a tree-structured data model. Naturally, the most popular XML querying language (XPath) uses patterns of selection predicates on multiple elements, related by a tree structure. These are often abstracted by twig patterns. Finding all occurrences of such a (XML query) twig pattern in an XML document is a core operation for XML query processing. We present a new algorithm, GPU-Twig, for matching twig patterns in large XML documents, using a GPU. GPU-Twig uses the data and task parallelism of the GPU to perform memory-intensive tasks whereas the CPU is used to perform I/O and resource management. We efficiently exploit both the high-bandwidth GPU memory interface and the lower-bandwidth CPU main memory. We present the results of an extensive experimentation of the GPU-Twig algorithm on large XML documents using the DBLP and XMark benchmarks. Experimental results indicate that GPUTwig significantly reduces the running time of queries in comparison with other algorithms on CPU based platforms and multicore based platforms under various settings and scenarios.
September 10, 2012 by hgpu