high performance computing on graphics processing units: hgpu.org

hgpu.org » Programming » Algorithms » Approximate Similarity Search for Online Multimedia Services on Distributed CPU-GPU Platforms

Approximate Similarity Search for Online Multimedia Services on Distributed CPU-GPU Platforms

George Teodoro, Eduardo Valle, Nathan Mariano, Ricardo Torres, Wagner Meira Jr, Joel H. Saltz

Center for Comprehensive Informatics, Emory University, GA, USA

arXiv:1209.0410v1 [cs.MM] (3 Sep 2012)

@article{2012arXiv1209.0410T,

author={Teodoro, George and Valle, Eduardo and Mariano, Nathan and Torres, Ricardo and Meira Jr, Wagner and Saltz, Joel H.},

title={"{Approximate Similarity Search for Online Multimedia Services on Distributed CPU-GPU Platforms}"},

journal={ArXiv e-prints},

archivePrefix={"arXiv"},

eprint={1209.0410},

primaryClass={"cs.MM"},

keywords={Multimedia,Databases,Distributed, Parallel, and Cluster Computing},

year={2012},

month={sep}

}

Download (PDF)

View

Source

2050

views

Similarity search in high-dimentional spaces is a pivotal operation found a variety of database applications. Recently, there has been an increase interest in similarity search for online content-based multimedia services. Those services, however, introduce new challenges with respect to the very large volumes of data that have to be indexed/searched, and the need to minimize response times observed by the end-users. Additionally, those users dynamically interact with the systems creating fluctuating query request rates, requiring the search algorithm to adapt in order to better utilize the underline hardware to reduce response times. In order to address these challenges, we introduce hypercurves, a flexible framework for answering approximate k-nearest neighbor (kNN) queries for very large multimedia databases, aiming at online content-based multimedia services. Hypercurves executes on hybrid CPU–GPU environments, and is able to employ those devices cooperatively to support massive query request rates. In order to keep the response times optimal as the request rates vary, it employs a novel dynamic scheduler to partition the work between CPU and GPU. Hypercurves was throughly evaluated using a large database of multimedia descriptors. Its cooperative CPU–GPU execution achieved performance improvements of up to 30x when compared to the single CPU-core version. The dynamic work partition mechanism reduces the observed query response times in about 50% when compared to the best static CPU–GPU task partition configuration. In addition, Hypercurves achieves superlinear scalability in distributed (multi-node) executions, while keeping a high guarantee of equivalence with its sequential version — thanks to the proof of probabilistic equivalence, which supported its aggressive parallelization design.

Tags: Algorithms, Computer science, CUDA, Databases, Nearest neighbour, nVidia, nVidia GeForce GTX 260, nVidia GeForce GTX 470

September 4, 2012 by hgpu

No votes yet.

Please wait...