high performance computing on graphics processing units: hgpu.org


Billion-scale similarity search with GPUs

Jeff Johnson, Matthijs Douze, Herve Jegou

Facebook AI Research

arXiv:1702.08734 [cs.CV], (28 Feb 2017)

@article{johnson2017billionscale,
  title={Billion-scale similarity search with GPUs},
  author={Johnson, Jeff and Douze, Matthijs and Jegou, Herve},
  year={2017},
  month={feb},
  archivePrefix={arXiv},
  eprint={1702.08734},
  primaryClass={cs.CV}
}


Source codes

Package:

faiss: A library for efficient similarity search and clustering of dense vectors


Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles the problem of better utilizing GPUs for this task. While GPUs excel at data-parallel tasks, prior approaches are bottlenecked by algorithms that expose less parallelism, such as k-min selection, or make poor use of the memory hierarchy. We propose a design for k-selection that operates at up to 55% of theoretical peak performance, enabling a nearest neighbor implementation that is 8.5x faster than prior GPU state of the art. We apply it in different similarity search scenarios, proposing optimized designs for brute-force, approximate, and compressed-domain search based on product quantization. In all these setups, we outperform the state of the art by large margins. Our implementation enables the construction of a high-accuracy k-NN graph on 95 million images from the YFCC100M dataset in 35 minutes, and of a graph connecting 1 billion vectors in less than 12 hours on 4 Maxwell Titan X GPUs. We have open-sourced our approach for the sake of comparison and reproducibility.
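To make the two stages named in the abstract concrete, here is a minimal CPU sketch of brute-force nearest-neighbor search followed by k-selection. It is not the paper's GPU design and not the faiss API: the function names `squared_l2` and `knn_brute_force` are hypothetical, squared Euclidean distance is an assumed metric, and the standard-library `heapq.nsmallest` stands in for the GPU k-selection kernel.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_brute_force(query, database, k):
    """Return the k nearest database vectors as sorted (distance, index) pairs.

    Stage 1 (distance computation) is data-parallel: each database vector
    is compared to the query independently of the others.
    Stage 2 (k-selection) keeps only the k smallest distances.
    Both function names are illustrative, not taken from any library.
    """
    distances = ((squared_l2(query, vec), i) for i, vec in enumerate(database))
    # heapq.nsmallest selects the k smallest items without sorting all n.
    return heapq.nsmallest(k, distances)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: 1000 random 8-dimensional vectors (sizes chosen arbitrarily).
    database = [[rng.random() for _ in range(8)] for _ in range(1000)]
    query = database[42]
    for dist, idx in knn_brute_force(query, database, k=5):
        print(idx, round(dist, 4))
```

The distance pass parallelizes trivially, while the selection pass needs a shared running set of the best k candidates; that second stage is the one the abstract identifies as exposing less parallelism on GPUs.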

Tags: Computer science, CUDA, Data Structures and Algorithms, Databases, Machine learning, Nearest neighbour, nVidia, nVidia GeForce GTX Titan X, Package

March 5, 2017 by hgpu
