DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices
Bell Labs
15th International Conference on Information Processing in Sensor Networks (IPSN ’16), 2016
@inproceedings{lane2016deepx,
  title={DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices},
  author={Lane, Nicholas D. and Bhattacharya, Sourav and Georgiev, Petko and Forlivesi, Claudio and Jiao, Lei and Qendro, Lorena and Kawsar, Fahim},
  booktitle={15th International Conference on Information Processing in Sensor Networks (IPSN '16)},
  year={2016}
}
Breakthroughs from the field of deep learning are radically changing how sensor data are interpreted to extract the high-level information needed by mobile apps. It is critical that the gains in inference accuracy that deep models afford become embedded in future generations of mobile apps. In this work, we present the design and implementation of DeepX, a software accelerator for deep learning execution. DeepX significantly lowers the device resources (viz. memory, computation, energy) required by deep learning that currently act as a severe bottleneck to mobile adoption. The foundation of DeepX is a pair of resource control algorithms, designed for the inference stage of deep learning, that: (1) decompose monolithic deep model network architectures into unit-blocks of various types, which are then executed more efficiently by heterogeneous local device processors (e.g., GPUs, CPUs); and (2) perform principled resource scaling that adjusts the architecture of deep models to shape the overhead each unit-block introduces. Experiments show that DeepX allows even large-scale deep learning models to execute efficiently on modern mobile processors, significantly outperforming existing solutions such as cloud-based offloading.
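To make the two mechanisms named above concrete, here is a minimal sketch of (1) assigning unit-blocks to heterogeneous local processors and (2) resource scaling a block to fit a memory budget. Everything in it is an illustrative assumption: the `UnitBlock` and `Processor` types, the `assign` and `scale_block` helpers, the throughput/memory numbers, and the greedy cost model are hypothetical stand-ins, not DeepX's actual decomposition or scaling algorithms.

```python
# Hypothetical sketch only: names, numbers, and heuristics are assumptions,
# not the algorithms described in the DeepX paper.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class UnitBlock:
    name: str
    flops: float    # assumed estimate of the block's compute cost
    mem_mb: float   # assumed estimate of its weight/activation footprint

@dataclass
class Processor:
    name: str
    gflops: float        # assumed sustained throughput
    mem_budget_mb: float # assumed memory available to this processor

def assign(blocks: List[UnitBlock], procs: List[Processor]) -> List[Tuple[str, str]]:
    """Greedy per-block assignment: run each unit-block on the processor with
    the lowest estimated runtime among those whose memory budget it fits."""
    plan = []
    for b in blocks:
        feasible = [p for p in procs if b.mem_mb <= p.mem_budget_mb]
        best = min(feasible, key=lambda p: b.flops / (p.gflops * 1e9))
        plan.append((b.name, best.name))
    return plan

def scale_block(block: UnitBlock, mem_cap_mb: float) -> UnitBlock:
    """Toy resource scaling: shrink an oversized block until it fits the cap,
    with compute cost assumed to shrink proportionally to the scale factor."""
    if block.mem_mb <= mem_cap_mb:
        return block
    s = mem_cap_mb / block.mem_mb
    return UnitBlock(f"{block.name}(x{s:.2f})", block.flops * s, mem_cap_mb)

if __name__ == "__main__":
    # Illustrative blocks carved out of a monolithic network (assumed values).
    blocks = [UnitBlock("conv1-3", 9e8, 40.0),
              UnitBlock("conv4-5", 6e8, 120.0),
              UnitBlock("fc6-8", 1.2e9, 220.0)]
    procs = [Processor("CPU", 8.0, 256.0), Processor("GPU", 40.0, 128.0)]
    blocks = [scale_block(b, 200.0) for b in blocks]  # assumed device-wide cap
    print(assign(blocks, procs))
```

Under these assumed numbers, the small convolutional blocks land on the GPU while the scaled fully-connected block falls back to the CPU, which is the kind of cross-processor split the abstract describes.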
March 15, 2016 by hgpu