high performance computing on graphics processing units: hgpu.org

Accelerating K-Means on the Graphics Processor via CUDA

Mario Zechner, Michael Granitzer

Know-Center, Inffeldgasse 21a, 8010 Graz, Austria

First International Conference on Intensive Applications and Services, 2009. INTENSIVE ’09, p.7-15

DOI:10.1109/INTENSIVE.2009.19

In this paper an optimized k-means implementation on the graphics processing unit (GPU) is presented. NVIDIA's Compute Unified Device Architecture (CUDA), available from the G80 GPU family onwards, is used as the programming environment. Emphasis is placed on optimizations directly targeted at this architecture to best exploit the computational capabilities available. Additionally, drawbacks and limitations of previous related work, e.g. limits on the maximum instance, dimension and centroid counts, are addressed. The algorithm is realized in a hybrid manner, parallelizing distance calculations on the GPU while sequentially updating cluster centroids on the CPU based on the results from the GPU calculations. An empirical performance study on synthetic data is given, demonstrating a speedup of up to 14 times over a fully SIMD-optimized CPU implementation.
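The hybrid split described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it is plain Python running entirely on the CPU, and the function names (`assign_labels`, `update_centroids`, `kmeans`), the stopping rule (labels unchanged between iterations), and the handling of empty clusters are assumptions made for illustration. `assign_labels` marks the data-parallel step that the paper offloads to the GPU, since each point's nearest centroid can be found independently; `update_centroids` is the sequential step the paper keeps on the CPU.

```python
def assign_labels(points, centroids):
    """Data-parallel step (the part the paper runs on the GPU).

    Each point independently finds its nearest centroid by squared
    Euclidean distance, so every loop iteration could be one GPU thread.
    """
    labels = []
    for p in points:
        best, best_d = 0, float("inf")
        for k, c in enumerate(centroids):
            d = sum((a - b) ** 2 for a, b in zip(p, c))
            if d < best_d:
                best, best_d = k, d
        labels.append(best)
    return labels


def update_centroids(points, labels, centroids):
    """Sequential step (kept on the CPU in the paper).

    Averages the points assigned to each centroid. Assumption: a
    centroid with no assigned points is left unchanged.
    """
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p, k in zip(points, labels):
        counts[k] += 1
        for j in range(dim):
            sums[k][j] += p[j]
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]


def kmeans(points, centroids, max_iter=100):
    """Alternate the two steps until the labels stop changing (assumed rule)."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign_labels(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return centroids, labels


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(pts, [(0.0, 0.0), (10.0, 10.0)]))
```

In a CUDA version along these lines, the outer loop of `assign_labels` would plausibly become the kernel's thread index and only the label array would need to be copied back to the host each iteration. How the paper actually lays out memory and transfers is not stated in the abstract.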

Tags: Computer science, CUDA, Data mining, nVidia, nVidia GeForce 9600 GT

December 18, 2010 by hgpu
