high performance computing on graphics processing units: hgpu.org


Billion-scale similarity search with GPUs

Jeff Johnson, Matthijs Douze, Herve Jegou

Facebook AI Research

arXiv:1702.08734 [cs.CV], (28 Feb 2017)


Package:

faiss: A library for efficient similarity search and clustering of dense vectors


Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles the problem of better utilizing GPUs for this task. While GPUs excel at data-parallel tasks, prior approaches are bottlenecked by algorithms that expose less parallelism, such as k-min selection, or that make poor use of the memory hierarchy. We propose a design for k-selection that operates at up to 55% of theoretical peak performance, enabling a nearest neighbor implementation that is 8.5x faster than the prior GPU state of the art. We apply it in different similarity search scenarios by proposing optimized designs for brute-force, approximate, and compressed-domain search based on product quantization. In all these setups, we outperform the state of the art by large margins. Our implementation enables the construction of a high-accuracy k-NN graph on 95 million images from the Yfcc100M dataset in 35 minutes, and of a graph connecting 1 billion vectors in less than 12 hours on 4 Maxwell Titan X GPUs. We have open-sourced our approach for the sake of comparison and reproducibility.
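To make the terms in the abstract concrete, the following is a minimal CPU sketch of brute-force nearest neighbor search, with k-selection (keeping the k smallest distances) as a separate step. It is not the paper's GPU design and does not use the faiss API: the function names `squared_l2`, `k_select`, and `brute_force_knn`, the squared L2 metric, and the random toy data are all assumptions chosen for illustration, and the selection step simply uses the standard-library heap.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_select(distances, k):
    """Return the k smallest (distance, index) pairs in ascending order.

    Illustrative stand-in for the k-selection step; a heap-based CPU
    routine, not the GPU algorithm described in the paper.
    """
    return heapq.nsmallest(k, ((d, i) for i, d in enumerate(distances)))


def brute_force_knn(queries, database, k):
    """Exhaustive search: compare every query against every database vector."""
    results = []
    for q in queries:
        dists = [squared_l2(q, x) for x in database]
        results.append(k_select(dists, k))
    return results


# Hypothetical toy data: 1000 random 8-dimensional vectors.
random.seed(0)
database = [[random.random() for _ in range(8)] for _ in range(1000)]

# Use two database vectors as queries, so each query's nearest
# neighbor is itself at distance 0.
queries = [database[3], database[42]]
neighbors = brute_force_knn(queries, database, k=5)
```

In this sketch the distance computation is trivially data-parallel, while the selection step is inherently more sequential, which is the kind of imbalance the abstract identifies as a bottleneck on GPUs.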

Tags: Computer science, CUDA, Data Structures and Algorithms, Databases, Machine learning, Nearest neighbour, nVidia, nVidia GeForce GTX Titan X, Package

March 5, 2017 by hgpu

