high performance computing on graphics processing units: hgpu.org

A CUDA-based parallel implementation of K-nearest neighbor algorithm

Shenshen Liang, Ying Liu, Cheng Wang, Liheng Jian

Graduate University of Chinese Academy of Sciences, Beijing, China

International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, 2009. CyberC ’09

DOI:10.1109/CYBERC.2009.5399145

Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high-performance computing for general-purpose applications. Because of its tremendous computing capability, the GPU has emerged as a co-processor to the CPU, raising overall throughput. The CUDA programming model gives programmers C-like APIs with which to exploit the parallel power of the GPU. K-nearest neighbor (KNN) is a widely used classification technique with significant applications in various domains, especially text classification. Because KNN is computationally intensive, it calls for a high-performance implementation. In this paper, we present CUKNN, a CUDA-based parallel implementation of KNN that uses the CUDA multi-thread model to process data elements in a data-parallel fashion. Various CUDA optimization techniques are applied to maximize utilization of the GPU. On an HP xw8600 workstation, CUKNN significantly outperforms serial KNN, achieving up to a 46.71x speedup including I/O time. It also scales well as the dimension of the reference dataset, the number of records in the reference dataset, and the number of records in the query dataset are varied.
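For readers unfamiliar with KNN, the serial baseline the abstract compares against can be sketched roughly as follows. This is a minimal brute-force illustration in Python, not the paper's CUKNN code: the function names (`squared_distance`, `knn_classify`), the squared Euclidean metric, and the majority-vote tie handling are all assumptions made for the example. The comments mark the per-pair distance computation, which is the independent work that a data-parallel implementation could plausibly spread across GPU threads.

```python
from collections import Counter
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors.

    Assumed metric for this sketch; the abstract does not name one.
    """
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(reference, labels, queries, k):
    """Classify each query by majority vote among its k nearest references.

    reference: list of feature vectors with known classes
    labels:    class label for each reference vector
    queries:   list of feature vectors to classify
    k:         number of neighbors that vote
    """
    predictions = []
    for q in queries:
        # Each (query, reference) distance is independent of every other
        # pair, so this step is the natural candidate for data parallelism.
        dists = [(squared_distance(q, r), i) for i, r in enumerate(reference)]

        # Select the k smallest distances (ties broken by reference index).
        nearest = heapq.nsmallest(k, dists)

        # Majority vote; on a tied count, the label seen first among the
        # nearest neighbors wins (an arbitrary choice for this sketch).
        votes = Counter(labels[i] for _, i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


if __name__ == "__main__":
    reference = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["a", "a", "a", "b", "b", "b"]
    queries = [(0.2, 0.1), (5.5, 5.2)]
    print(knn_classify(reference, labels, queries, k=3))
```

The cost of this loop grows with the product of the number of queries, the number of reference records, and the dimension, which are the same three quantities the abstract varies when reporting scalability.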

Tags: Algorithms, Computer science, CUDA, Nearest neighbour, nVidia, Optimization

September 5, 2011 by hgpu
