## Combining approximate inference methods for efficient learning on large computer clusters

Frankfurt Institute for Advanced Studies, Germany

Workshop on Big Learning: Algorithms, Systems, and Tools for Learning at Scale (NIPS’11), 2011

@inproceedings{dai2012combining,
  title={Combining approximate inference methods for efficient learning on large computer clusters},
  author={Dai, Z. and Shelton, J. A. and Bornschein, J. and Sheikh, A. S. and L{\"u}cke, J.},
  booktitle={NIPS Workshop on Big Learning: Algorithms, Systems, and Tools for Learning at Scale},
  year={2012}
}

An important challenge in machine learning is to develop learning algorithms that can handle large amounts of data at a realistically large scale. This entails not only the development of algorithms that can be efficiently trained to infer the parameters of a model from a given dataset, but also demands careful thought about the tools (both software and hardware) used in their implementation. We extend the previously developed framework for parallel Expectation Maximization (EM) learning [1] to different models with corresponding parallelization techniques. To further tackle problems of computational complexity, and to exploit the capabilities of parallel computing hardware (CPU/GPU clusters), we developed a set of techniques that can be tailored to specific large-scale learning problems. For instance, we design a dynamic data repartitioning technique for "Gaussian sparse coding" (Sec. 3.2), use specialized GPU kernels for translation-invariant learning (Sec. 3.3), and show how sampling can be used to further scale learning on very high-dimensional data (Sec. 3.4). We propose these as examples of a parallelization toolbox whose components can be creatively combined and exploited in model- and task-driven ways. The framework is a lightweight, easy-to-use implementation in Python that facilitates the development of massively parallel machine learning algorithms, using the Message Passing Interface (MPI) for communication between compute nodes. Once algorithms are integrated into the framework, they can be executed on large numbers of processor cores and applied to large datasets. Some of the numerical experiments we performed ran on InfiniBand-interconnected clusters, used up to 5000 parallel processor cores, and performed more than 10^17 floating-point operations. For reasonably balanced meta-parameters (number of data points vs. number of latent variables vs. number of model parameters to be inferred), we observe close to linear runtime scaling with respect to the number of cores in use.
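The data-parallel EM pattern underlying such a framework can be illustrated with a small sketch: each worker holds a shard of the data, computes partial sufficient statistics in the E-step, and the partials are summed globally (the role an MPI allreduce plays on a real cluster) before the M-step. The sketch below simulates this with plain NumPy for a two-component 1-D Gaussian mixture; the model and all names are illustrative and not taken from the paper's actual framework.

```python
import numpy as np

# Synthetic data from two Gaussians, split across 4 simulated workers.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 500),
                       rng.normal(3.0, 1.0, 500)])
shards = np.array_split(data, 4)

mu = np.array([-1.0, 1.0])     # initial component means
sigma = np.array([1.0, 1.0])   # initial standard deviations
pi = np.array([0.5, 0.5])      # mixing weights

for _ in range(50):
    # E-step, independently per shard: partial sufficient statistics.
    partials = []
    for x in shards:
        x = x[:, None]
        logp = (-0.5 * ((x - mu) / sigma) ** 2
                - np.log(sigma) + np.log(pi))
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)   # responsibilities
        partials.append((r.sum(0), (r * x).sum(0), (r * x ** 2).sum(0)))
    # Global reduction: on a cluster this would be MPI_Allreduce(SUM).
    N, S1, S2 = (sum(p[i] for p in partials) for i in range(3))
    # M-step from the global statistics only.
    mu = S1 / N
    sigma = np.sqrt(np.maximum(S2 / N - mu ** 2, 1e-12))
    pi = N / N.sum()
```

Because the M-step depends on the data only through the reduced statistics, communication per iteration is constant in the dataset size, which is what makes the near-linear scaling reported above plausible.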
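The dynamic data repartitioning idea mentioned for Gaussian sparse coding (Sec. 3.2) can also be sketched in miniature: if per-data-point cost varies (e.g. with the number of active latent variables), a static equal-count split leaves workers unevenly loaded, so points are periodically reassigned to balance cumulative cost. The greedy prefix split below is a hypothetical illustration, not the paper's algorithm.

```python
import numpy as np

def repartition(costs, n_workers):
    """Assign each data point a worker id so that the total estimated
    cost per worker is roughly equal (greedy prefix-sum split)."""
    target = costs.sum() / n_workers
    cum = np.cumsum(costs)
    # Point i goes to the worker whose cost bucket its prefix sum falls in.
    return np.minimum((cum / target).astype(int), n_workers - 1)

# Illustrative per-point cost estimates (e.g. active latents per point).
rng = np.random.default_rng(1)
costs = rng.integers(1, 10, size=1000).astype(float)
owner = repartition(costs, 8)
loads = np.bincount(owner, weights=costs, minlength=8)
```

After repartitioning, each worker's load deviates from the ideal share by at most roughly one point's cost, so no single node stalls the synchronous EM iteration.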

January 19, 2012 by hgpu