high performance computing on graphics processing units: hgpu.org

Enabling Energy-Efficient DNN Training on Hybrid GPU-FPGA Accelerators

Xin He, Jiawen Liu, Zhen Xie, Hao Chen, Guoyang Chen, Weifeng Zhang, Dong Li

Hunan University, Changsha, China

International Conference on Supercomputing (ICS ’21), 2021

@article{he2021enabling,
  title={Enabling Energy-Efficient DNN Training on Hybrid GPU-FPGA Accelerators},
  author={He, Xin and Liu, Jiawen and Xie, Zhen and Chen, Hao and Chen, Guoyang and Zhang, Weifeng and Li, Dong},
  year={2021}
}

DNN training consumes orders of magnitude more energy than inference and requires innovative use of accelerators to improve energy efficiency. However, despite their complementary features, GPUs and FPGAs have mostly been used independently for the entire training process, which neglects the opportunity to assign individual, distinct operations to the most suitable hardware. In this paper, we take the initiative to explore new opportunities and viable solutions for enabling energy-efficient DNN training on hybrid accelerators. To overcome fundamental challenges, including avoiding training throughput loss, enabling fast design space exploration, and scheduling efficiently, we propose a comprehensive framework, Hype-training, that combines offline characterization, performance modeling, and online scheduling of individual operations. Experimental tests using NVIDIA V100 GPUs and Intel Stratix 10 FPGAs show that Hype-training is able to exploit a mixture of GPUs and FPGAs at a fine granularity to achieve significant energy reduction, 44.3% on average and up to 59.7%, without any loss in training throughput. Hype-training can also enforce power caps more effectively than state-of-the-art power management mechanisms on GPUs.
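The idea of assigning individual operations to the most suitable device can be illustrated with a small sketch. This is not the Hype-training algorithm; it is a minimal greedy example under stated assumptions: the `OpProfile` record, the `assign_devices` and `totals` helpers, and every latency and energy figure are hypothetical, and data-transfer costs between devices are ignored. An operation moves to the FPGA only if its profiled FPGA latency is no worse than on the GPU and its energy is lower, so total time cannot exceed the all-GPU baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpProfile:
    """Hypothetical offline measurements for one operation (made-up numbers)."""
    name: str
    gpu_ms: float   # latency on the GPU, milliseconds
    gpu_mj: float   # energy on the GPU, millijoules
    fpga_ms: float  # latency on the FPGA, milliseconds
    fpga_mj: float  # energy on the FPGA, millijoules


def assign_devices(profiles):
    """Greedy per-operation placement.

    An op goes to the FPGA only when it is no slower there and uses less
    energy; otherwise it stays on the GPU. Transfer costs are ignored.
    """
    plan = {}
    for p in profiles:
        fpga_ok = p.fpga_ms <= p.gpu_ms and p.fpga_mj < p.gpu_mj
        plan[p.name] = "fpga" if fpga_ok else "gpu"
    return plan


def totals(profiles, plan):
    """Return (total latency in ms, total energy in mJ) for a placement."""
    time_ms = sum(p.fpga_ms if plan[p.name] == "fpga" else p.gpu_ms
                  for p in profiles)
    energy_mj = sum(p.fpga_mj if plan[p.name] == "fpga" else p.gpu_mj
                    for p in profiles)
    return time_ms, energy_mj


# Illustrative profile table; none of these values come from the paper.
PROFILES = [
    OpProfile("conv1", gpu_ms=2.0, gpu_mj=400.0, fpga_ms=6.0, fpga_mj=300.0),
    OpProfile("relu1", gpu_ms=0.5, gpu_mj=100.0, fpga_ms=0.4, fpga_mj=20.0),
    OpProfile("pool1", gpu_ms=0.5, gpu_mj=100.0, fpga_ms=0.5, fpga_mj=30.0),
]

plan = assign_devices(PROFILES)
baseline = totals(PROFILES, {p.name: "gpu" for p in PROFILES})
hybrid = totals(PROFILES, plan)
saving = 1.0 - hybrid[1] / baseline[1]

print(plan)
print(f"baseline: {baseline}, hybrid: {hybrid}, energy saving: {saving:.1%}")
```

In this toy table the convolution stays on the GPU because the FPGA would be slower, while the two lightweight operations move to the FPGA. A real scheduler would also have to account for inter-device transfers and overlapping execution, which this sketch leaves out.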

Tags: Computer science, Deep learning, Design space exploration, FPGA, Hybrid computing, nVidia, OpenCL, Tesla V100

May 2, 2021 by hgpu
