high performance computing on graphics processing units: hgpu.org


Using of GPUs for cluster analysis of large data by K-means method

Natalya Litvinenko

Institute of Mathematics and Mathematical Modeling NAS RK, Al-Farabi Kazakh National University, Almaty, Kazakhstan

arXiv:1402.3788 [cs.DC], (16 Feb 2014)

@article{2014arXiv1402.3788L,
  author={{Litvinenko}, N.},
  title={{Using of GPUs for cluster analysis of large data by K-means method}},
  journal={ArXiv e-prints},
  archivePrefix={arXiv},
  eprint={1402.3788},
  primaryClass={cs.DC},
  keywords={Computer Science – Distributed, Parallel, and Cluster Computing, 68W10, B.2.4, C.1.2, C.1.4},
  year={2014},
  month={feb},
  adsurl={http://adsabs.harvard.edu/abs/2014arXiv1402.3788L},
  adsnote={Provided by the SAO/NASA Astrophysics Data System}
}


This problem was solved within the framework of the grant project "Solving of problems of cluster analysis with application of parallel algorithms and cloud technologies" at the Institute of Mathematics and Mathematical Modelling in Almaty. Cluster analysis of large amounts of data is very important in many areas of science, such as genetics, biology, and sociology. At the same time, well-known statistical packages such as STATISTICA, STADIA, SYSTAT and others cannot solve large problems. A new algorithm was developed that uses the high processing power of GPUs to solve clustering problems by the K-means method. The algorithm is implemented as a C++ application in Microsoft Visual Studio 2010 and runs on an Nvidia GeForce 660 GPU. The resulting software package can handle up to 2 million records with up to 25 features each. Computing time is reduced by a factor of 5. We plan to increase this factor to 20-30 after improving the algorithms.
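The abstract does not describe the algorithm itself, so the following is only a minimal sketch of one iteration of textbook K-means (Lloyd's method) in plain C++, not the author's implementation. The function names `assign_clusters` and `update_centroids`, the row-major `float` layout, and the rule that an empty cluster keeps its old centroid are all assumptions made for illustration. The point of the sketch is that the outer loop of the assignment step treats every record independently, which is the kind of work that could plausibly be mapped to one GPU thread per record.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assignment step: label each record with the index of its nearest centroid
// (squared Euclidean distance). Record i occupies data[i*dim .. i*dim+dim-1].
// Iterations of the outer loop are independent of one another.
std::vector<int> assign_clusters(const std::vector<float>& data,
                                 const std::vector<float>& centroids,
                                 std::size_t dim) {
    assert(dim > 0 && data.size() % dim == 0 && centroids.size() % dim == 0);
    const std::size_t n = data.size() / dim;       // number of records
    const std::size_t k = centroids.size() / dim;  // number of clusters
    std::vector<int> labels(n, -1);
    for (std::size_t i = 0; i < n; ++i) {
        float best = std::numeric_limits<float>::max();
        for (std::size_t c = 0; c < k; ++c) {
            float d2 = 0.0f;
            for (std::size_t j = 0; j < dim; ++j) {
                const float diff = data[i * dim + j] - centroids[c * dim + j];
                d2 += diff * diff;
            }
            if (d2 < best) {
                best = d2;
                labels[i] = static_cast<int>(c);
            }
        }
    }
    return labels;
}

// Update step: replace each centroid with the mean of the records assigned
// to it. A cluster with no records keeps its previous centroid (assumption).
void update_centroids(const std::vector<float>& data,
                      const std::vector<int>& labels,
                      std::vector<float>& centroids,
                      std::size_t dim) {
    assert(dim > 0 && data.size() == labels.size() * dim);
    const std::size_t k = centroids.size() / dim;
    std::vector<float> sums(centroids.size(), 0.0f);
    std::vector<std::size_t> counts(k, 0);
    for (std::size_t i = 0; i < labels.size(); ++i) {
        assert(labels[i] >= 0 && static_cast<std::size_t>(labels[i]) < k);
        const std::size_t c = static_cast<std::size_t>(labels[i]);
        ++counts[c];
        for (std::size_t j = 0; j < dim; ++j) {
            sums[c * dim + j] += data[i * dim + j];
        }
    }
    for (std::size_t c = 0; c < k; ++c) {
        if (counts[c] == 0) continue;
        for (std::size_t j = 0; j < dim; ++j) {
            centroids[c * dim + j] = sums[c * dim + j] / static_cast<float>(counts[c]);
        }
    }
}
```

A caller would alternate `assign_clusters` and `update_centroids` until the labels stop changing. Under the single-precision layout assumed here, the largest input mentioned in the abstract (2 million records with 25 features) would occupy about 200 MB.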

Tags: Algorithms, Cluster analysis, Computer science, CUDA, nVidia, nVidia GeForce GTX 660

February 19, 2014 by hgpu
