gSMat: A Scalable Sparse Matrix-based Join for SPARQL Query Processing

Xiaowang Zhang, Mingyue Zhang, Peng Peng, Jiaming Song, Zhiyong Feng, Lei Zou
School of Computer Science and Technology, Tianjin University, Tianjin, China
arXiv:1807.07691 [cs.DB], (20 Jul 2018)


   title={gSMat: A Scalable Sparse Matrix-based Join for SPARQL Query Processing},

   author={Zhang, Xiaowang and Zhang, Mingyue and Peng, Peng and Song, Jiaming and Feng, Zhiyong and Zou, Lei},






Download Download (PDF)   View View   Source Source   



Resource Description Framework (RDF) has been widely used to represent information on the web, while SPARQL is a standard query language to manipulate RDF data. Given a SPARQL query, there often exist many joins which are the bottlenecks of efficiency of query processing. Besides, the real RDF datasets often reveal strong data sparsity, which indicates that a resource often only relates to a few resources even the number of total resources is large. In this paper, we propose a sparse matrix-based (SM-based) SPARQL query processing approach over RDF datasets which con- siders both join optimization and data sparsity. Firstly, we present a SM-based storage for RDF datasets to lift the storage efficiency, where valid edges are stored only, and then introduce a predicate- based hash index on the storage. Secondly, we develop a scalable SM-based join algorithm for SPARQL query processing. Finally, we analyze the overall cost by accumulating all intermediate results and design a query plan generated algorithm. Besides, we extend our SM-based join algorithm on GPU for parallelizing SPARQL query processing. We have evaluated our approach compared with the state-of-the-art RDF engines over benchmark RDF datasets and the experimental results show that our proposal can significantly improve SPARQL query processing with high scalability.
Rating: 3.0/5. From 2 votes.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: