high performance computing on graphics processing units: hgpu.org

Scalability of Self-organizing Maps on a GPU cluster using OpenCL and CUDA

Sabine McConnell, Robert Sturgeon, Gregory Henry, Andrew Mayne, Richard Hurley

Department of Computing and Information Systems, Trent University, Peterborough, Ontario, Canada

Journal of Physics: Conference Series, 341, 012018, 2012

DOI:10.1088/1742-6596/341/1/012018

We evaluate a novel implementation of a Self-Organizing Map (SOM) on a Graphics Processing Unit (GPU) cluster. Using various combinations of OpenCL, CUDA, and two different graphics cards, we demonstrate the scalability of the SOM implementation on one to eight GPUs. Results indicate that the algorithm scales well with the number of training samples and the map size. However, the benefits of the data-parallel approaches offered by the GPU are severely limited when combined with the Message Passing Interface (MPI) in this setting, and are comparable to the speedups that GPU-based implementations achieve over optimized sequential code. The achieved speedups range from 3 to 32 across various map and training data sizes. We also observed a performance penalty for the OpenCL implementation compared to CUDA.
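
The abstract does not say which SOM variant or work decomposition the authors used, so the following is only a minimal sketch under stated assumptions: a batch SOM with a Gaussian neighbourhood, where the training data is split into shards, each shard produces partial sums, and the partial sums are then added together. The loop over shards stands in for what separate GPUs or MPI ranks might do in a cluster setting. All function names (`best_matching_unit`, `partial_sums`, `batch_epoch`) are hypothetical and not taken from the paper.

```python
import math

def best_matching_unit(weights, sample):
    """Return the index of the map unit whose weight vector is closest to sample."""
    best, best_dist = 0, float("inf")
    for i, w in enumerate(weights):
        d = sum((wi - si) ** 2 for wi, si in zip(w, sample))
        if d < best_dist:
            best, best_dist = i, d
    return best

def partial_sums(weights, coords, samples, sigma):
    """Work for one shard of training data (assumed: one shard per worker)."""
    dim = len(weights[0])
    num = [[0.0] * dim for _ in weights]   # neighbourhood-weighted sample sums
    den = [0.0] * len(weights)             # sums of neighbourhood weights
    for s in samples:
        b = best_matching_unit(weights, s)
        for j, c in enumerate(coords):
            # Squared grid distance between unit j and the winning unit b.
            g2 = (c[0] - coords[b][0]) ** 2 + (c[1] - coords[b][1]) ** 2
            h = math.exp(-g2 / (2.0 * sigma ** 2))
            den[j] += h
            for k in range(dim):
                num[j][k] += h * s[k]
    return num, den

def batch_epoch(weights, coords, shards, sigma):
    """One batch-SOM epoch: per-shard partial sums, then a plain reduction.

    The reduction loop is a stand-in for a collective sum across workers;
    it is an illustrative assumption, not the paper's communication scheme.
    """
    dim = len(weights[0])
    num = [[0.0] * dim for _ in weights]
    den = [0.0] * len(weights)
    for shard in shards:
        n, d = partial_sums(weights, coords, shard, sigma)
        for j in range(len(weights)):
            den[j] += d[j]
            for k in range(dim):
                num[j][k] += n[j][k]
    # New weight = weighted mean of samples; keep the old weight if no mass.
    return [
        [num[j][k] / den[j] for k in range(dim)] if den[j] > 0 else list(weights[j])
        for j in range(len(weights))
    ]
```

In this sketch the per-sample work inside `partial_sums` is independent across samples, which is the part a GPU could parallelise, while the reduction in `batch_epoch` is the step that would require communication between nodes. Splitting the same data into one shard or several yields the same updated map up to floating-point rounding.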

Tags: Algorithms, Computer science, CUDA, Data mining, GPU cluster, nVidia, nVidia GeForce GT 220, OpenCL, Self-organizing map, Tesla S1070

February 11, 2012 by hgpu
