
Dogwild! – Distributed Hogwild for CPU & GPU

Cyprien Noel, Simon Osindero
Flickr Vision & Machine Learning Group, Yahoo! Inc
Distributed Machine Learning and Matrix Computations, NIPS 2014 Workshop, 2014

@inproceedings{noel2014dogwild,
   title     = {Dogwild! -- Distributed Hogwild for CPU \& GPU},
   author    = {Noel, Cyprien and Osindero, Simon},
   booktitle = {NIPS Workshop on Distributed Machine Learning and Matrix Computations},
   year      = {2014}
}

Deep learning has enjoyed tremendous success in recent years. Unfortunately, training large models can be very time consuming, even on GPU hardware. We describe a set of extensions to the state-of-the-art Caffe library [3], allowing training on multiple threads and GPUs, and across multiple machines. Our focus is on architecture: implementing asynchronous SGD without increasing Caffe’s complexity. We isolate parallelization from Caffe’s existing SGD code, train unmodified models, and run on commodity hardware. Isolation is achieved by extending the Hogwild model, i.e. running parallel SGD solvers without synchronization, and by also removing synchronization between the solvers and the components in charge of streaming gradients between nodes. In this modular design, components interact exclusively through unsynchronized reads and writes to the weight buffer. Each component is free to loop over the weights at a different pace, keeping both compute and network resources fully utilized. SGD’s resiliency against gradient loss allows further performance improvements by avoiding reliable network protocols. It enables the use of multicast messages, and of low-level packet streaming through raw sockets or InfiniBand verbs. We show linear performance scaling for small clusters on MNIST, and early results on ImageNet.
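The following is a minimal sketch (not the authors' code) of the Hogwild-style design the abstract describes: several SGD solver threads and one streaming component all loop over a single shared weight buffer through plain, unsynchronized reads and writes, each at its own pace. The toy objective, buffer size, and all identifiers are illustrative assumptions, not part of Dogwild or Caffe.

    // Sketch of unsynchronized solver and streaming threads sharing one weight buffer.
    #include <cstdio>
    #include <random>
    #include <thread>
    #include <vector>

    constexpr int kDim = 1000;        // number of weights (assumed toy size)
    static float weights[kDim] = {};  // shared buffer; data races are intentional (Hogwild model)

    // One solver thread: compute a toy gradient and apply it in place, with no locking.
    void solver(int seed, int iters) {
      std::mt19937 rng(seed);
      std::normal_distribution<float> noise(0.0f, 0.01f);
      const float lr = 0.01f;
      for (int it = 0; it < iters; ++it) {
        for (int i = 0; i < kDim; ++i) {
          // Toy gradient: pull each weight toward 1.0, plus noise.
          float grad = (weights[i] - 1.0f) + noise(rng);
          weights[i] -= lr * grad;  // unsynchronized write
        }
      }
    }

    // Stand-in for the component that streams weights/gradients between nodes:
    // it sweeps the same buffer at its own pace; here it only reports a mean.
    void streamer(int rounds) {
      for (int r = 0; r < rounds; ++r) {
        float sum = 0.0f;
        for (int i = 0; i < kDim; ++i) sum += weights[i];  // unsynchronized read
        std::printf("stream pass %d, mean weight %.4f\n", r, sum / kDim);
      }
    }

    int main() {
      std::vector<std::thread> threads;
      for (int t = 0; t < 4; ++t) threads.emplace_back(solver, t, 200);  // parallel SGD solvers
      threads.emplace_back(streamer, 5);                                 // streaming component
      for (auto& th : threads) th.join();
      return 0;
    }

In the actual system, the streaming role would push the buffer over the network (multicast, raw sockets, or InfiniBand verbs) rather than print a summary; the key point illustrated is that no component ever waits on another.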