
Parallel and Distributed Deep Learning

Vishakh Hegde, Sheema Usmani
Stanford University, 2016

The goal of this report is to explore ways to parallelize and distribute deep learning in multi-core and distributed settings. We empirically analyze the speedup in training a CNN on a conventional single-core CPU and on a GPU, and provide practical suggestions to improve training times. In the distributed setting, we study and analyze synchronous and asynchronous weight-update algorithms (such as parallel SGD, ADMM, and Downpour SGD) and derive the worst-case asymptotic communication cost and computation time for each of these algorithms.
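To illustrate the synchronous weight-update scheme the abstract refers to, here is a minimal single-process sketch of one parallel-SGD step on a toy quadratic loss. It is an assumed setup for illustration only, not the report's exact algorithm: each "worker" computes a gradient on its own data shard, the gradients are averaged, and one update is applied to a shared weight.

```python
# Hypothetical sketch of synchronous parallel SGD (data parallelism).
# Toy objective per shard: mean((w - x)^2); workers and shards are simulated
# sequentially here, whereas a real system would run them concurrently.

def local_gradient(w, shard):
    # Gradient of the per-shard loss mean((w - x)^2) with respect to w.
    return sum(2.0 * (w - x) for x in shard) / len(shard)

def parallel_sgd_step(w, shards, lr=0.1):
    # Synchronous step: wait for every worker's gradient, average, then update.
    grads = [local_gradient(w, s) for s in shards]  # one gradient per worker
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad

# Toy usage: data centered at 3.0, split across two simulated workers.
shards = [[1.0, 2.0], [4.0, 5.0]]
w = 0.0
for _ in range(100):
    w = parallel_sgd_step(w, shards)
# w converges toward the data mean, 3.0
```

The synchronization point (averaging all gradients before updating) is what distinguishes this from asynchronous schemes such as Downpour SGD, where workers push gradients to a parameter server without waiting for one another.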
