CoCoNet: Co-Optimizing Computation and Communication for Distributed Machine Learning

Abhinav Jangda, Jun Huang, Guodong Liu, Amir Hossein Nodehi Sabet, Saeed Maleki, Youshan Miao, Madanlal Musuvathi, Todd Mytkowicz, Olli Sarikivi

University of Massachusetts Amherst, United States

arXiv:2105.05720 [cs.DC], (13 May 2021)

Modern deep learning workloads run on distributed hardware and are difficult to optimize: data, model, and pipeline parallelism require a developer to thoughtfully restructure their workload around optimized computation and communication kernels in libraries such as cuBLAS and NCCL. The logical separation between computation and communication leaves performance on the table, with missed optimization opportunities across abstraction boundaries. To explore these opportunities, this paper presents CoCoNet, which consists of a compute language to express programs with both computation and communication, a scheduling language to apply transformations on such programs, and a compiler to generate high-performance kernels. Providing both computation and communication as first-class constructs enables new optimizations, such as overlapping or fusion of communication with computation. CoCoNet allowed us to optimize several data, model, and pipeline parallel workloads in existing deep learning systems with very few lines of code. We show significant improvements after integrating novel CoCoNet-generated kernels.
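The two optimizations named in the abstract, fusion and overlap of communication with computation, can be illustrated with a small single-process sketch. This is not CoCoNet's language or its generated kernels: the "ranks" are plain Python lists, every function name here (`all_reduce_sum`, `unfused_step`, `fused_step`, `pipelined_time`) is hypothetical, and the timing function is an idealized two-stage pipeline model assumed for illustration only.

```python
# Toy, single-process illustration; "ranks" are plain Python lists, not GPUs.
# All names below are hypothetical and are not part of CoCoNet.

def all_reduce_sum(per_rank):
    """Simulated all-reduce: return the elementwise sum over all ranks."""
    return [sum(vals) for vals in zip(*per_rank)]

def unfused_step(weights, per_rank_grads, lr):
    # Communication first: materialize the fully reduced gradient buffer.
    reduced = all_reduce_sum(per_rank_grads)
    # Computation second: a separate pass over that buffer.
    return [w - lr * g for w, g in zip(weights, reduced)]

def fused_step(weights, per_rank_grads, lr):
    # One pass: reduce each element and apply the update immediately,
    # so no intermediate reduced-gradient buffer is materialized.
    return [w - lr * sum(vals)
            for w, vals in zip(weights, zip(*per_rank_grads))]

def pipelined_time(comm, compute, chunks):
    """Idealized model (an assumption): while chunk i is being computed on,
    chunk i+1 is being communicated. chunks=1 means no overlap at all."""
    c, p = comm / chunks, compute / chunks
    return c + p + (chunks - 1) * max(c, p)

weights = [1.0, 2.0, 3.0, 4.0]
grads = [[0.5, 1.0, 1.5, 2.0],   # gradient held by rank 0
         [0.5, 0.0, -0.5, 2.0]]  # gradient held by rank 1

updated_unfused = unfused_step(weights, grads, lr=0.5)
updated_fused = fused_step(weights, grads, lr=0.5)

sequential = pipelined_time(8.0, 8.0, chunks=1)  # communicate, then compute
overlapped = pipelined_time(8.0, 8.0, chunks=4)  # split into 4 chunks
```

Both update functions return the same weights; the fused version simply avoids the intermediate buffer. Under the assumed timing model, splitting the work into four chunks lowers the modeled time from 16.0 to 10.0 units, which is the kind of saving that overlap aims for.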

Tags: Code generation, Computer science, CUBLAS, CUDA, Deep learning, Machine learning, nVidia, nVidia DGX-2

May 23, 2021 by hgpu
