
MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU

Qingqing Cao, Niranjan Balasubramanian, Aruna Balasubramanian
Stony Brook University
arXiv:1706.00878 [cs.DC] (3 Jun 2017)

@article{cao2017mobirnn,
   title={MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU},
   author={Cao, Qingqing and Balasubramanian, Niranjan and Balasubramanian, Aruna},
   year={2017},
   month={jun},
   eprint={1706.00878},
   archivePrefix={arXiv},
   primaryClass={cs.DC}
}

In this paper, we explore optimizations for running Recurrent Neural Network (RNN) models locally on mobile devices. RNN models are widely used for Natural Language Processing, Machine Translation, and other tasks. However, existing mobile applications that use RNN models run them in the cloud. To address privacy and efficiency concerns, we show how RNN models can be run locally on mobile devices. Existing work on porting deep learning models to mobile devices focuses on Convolutional Neural Networks (CNNs) and cannot be applied directly to RNN models. In response, we present MobiRNN, a mobile-specific optimization framework that implements GPU offloading tailored to mobile GPUs. Evaluations using an RNN model for activity recognition show that MobiRNN significantly decreases the latency of running RNN models on phones.
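
The per-timestep computation being offloaded is the standard LSTM cell. As a rough illustration (not the paper's code), the sketch below implements one LSTM timestep in plain Java; the class name, gate ordering, and flattened weight layout are assumptions made for the example. The four gate matrix-vector products are independent of one another, which is the kind of data parallelism a framework such as MobiRNN can map onto mobile GPU threads instead of running the loops sequentially on the CPU.

// Minimal single-threaded LSTM cell step in plain Java. Illustrative
// sketch only; MobiRNN's actual implementation is GPU-parallel.
public final class LstmCell {
    private final int inputSize, hiddenSize;
    // Gate order (an assumption for this sketch):
    // 0 = input (i), 1 = forget (f), 2 = cell candidate (g), 3 = output (o).
    private final float[][] w;  // [4][hiddenSize * inputSize], row-major
    private final float[][] u;  // [4][hiddenSize * hiddenSize], row-major
    private final float[][] b;  // [4][hiddenSize]

    public LstmCell(int inputSize, int hiddenSize,
                    float[][] w, float[][] u, float[][] b) {
        this.inputSize = inputSize;
        this.hiddenSize = hiddenSize;
        this.w = w; this.u = u; this.b = b;
    }

    private static float sigmoid(float x) {
        return (float) (1.0 / (1.0 + Math.exp(-x)));
    }

    /** One timestep: reads x and the previous (h, c), writes the new (h, c). */
    public void step(float[] x, float[] h, float[] c) {
        float[][] gates = new float[4][hiddenSize];
        // The four gate pre-activations are independent; on a mobile GPU
        // each (gate, hidden unit) pair can be computed by its own thread.
        for (int g = 0; g < 4; g++) {
            for (int j = 0; j < hiddenSize; j++) {
                float acc = b[g][j];
                for (int k = 0; k < inputSize; k++)  acc += w[g][j * inputSize + k] * x[k];
                for (int k = 0; k < hiddenSize; k++) acc += u[g][j * hiddenSize + k] * h[k];
                gates[g][j] = acc;
            }
        }
        // Element-wise state update: c_t = f*c + i*g, h_t = o*tanh(c_t).
        for (int j = 0; j < hiddenSize; j++) {
            float i  = sigmoid(gates[0][j]);
            float f  = sigmoid(gates[1][j]);
            float gc = (float) Math.tanh(gates[2][j]);
            float o  = sigmoid(gates[3][j]);
            c[j] = f * c[j] + i * gc;
            h[j] = o * (float) Math.tanh(c[j]);
        }
    }
}

Note that only the element-wise update at the end depends on the newly computed gates, so the bulk of each timestep's work (the matrix-vector products) is embarrassingly parallel; the sequential dependency in an RNN is across timesteps, not within one.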