high performance computing on graphics processing units: hgpu.org


TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing

Jin-Hwa Kim, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang

Interdisciplinary Program in Cognitive Science, Seoul National University

Proceedings of KIIS Spring Conference, Vol. 26, No. 1, 2016

@inproceedings{kim2016trimzero,
  title={TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing},
  author={Kim, Jin-Hwa and Kim, Jeonghee and Ha, Jung-Woo and Zhang, Byoung-Tak},
  booktitle={Proceedings of KIIS Spring Conference},
  volume={26},
  number={1},
  year={2016}
}


Package: RNN: Recurrent Neural Network library for Torch7’s nn


Deep learning frameworks built on the CUDA parallel computing platform have accelerated progress in machine learning research. Much of the benefit of parallel processing comes from efficient matrix-matrix multiplication on CUDA-enabled graphics processing units (GPUs). For recurrent neural networks (RNNs), this forces a learning batch to be represented as a zero-filled matrix, so that sentences of variable length fit into a single matrix; these zeros, however, waste computational resources. We propose an efficient algorithm that trims the zeros from the batch for RNNs while producing the same result. Benchmark results validate our method, showing approximately 25% faster learning. Empirically, a natural language task confirms our results.
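The idea of trimming zeros from a batch can be illustrated with a minimal sketch in plain Python. This is not the TrimZero module from the rnn package or the authors' code. It assumes that padded rows are all-zero vectors and that a padded row should yield an all-zero output. The `step` function is a hypothetical stand-in for the costly per-row recurrent computation, and `masked_forward` and `trimmed_forward` are names invented for this example.

```python
# Illustrative sketch only: not the TrimZero module from the rnn package.
# Assumptions: padding rows are all-zero vectors, and a padded row
# should yield an all-zero output.

def step(row):
    """Hypothetical stand-in for the expensive per-row RNN step."""
    return [2.0 * x + 1.0 for x in row]

def masked_forward(batch):
    """Baseline: compute every row, then zero out the padded ones."""
    out = []
    for row in batch:
        y = step(row)  # wasted work when the row is padding
        is_pad = all(x == 0 for x in row)
        out.append([0.0] * len(row) if is_pad else y)
    return out

def trimmed_forward(batch):
    """Trim padded rows, compute on the rest, scatter results back."""
    keep = [i for i, row in enumerate(batch) if any(x != 0 for x in row)]
    out = [[0.0] * len(row) for row in batch]  # zero-filled output
    for i in keep:
        out[i] = step(batch[i])  # only non-padded rows are computed
    return out, len(keep)

# One time step of a batch of four sentences; two have already ended.
batch_t = [
    [0.5, -1.0],
    [0.0, 0.0],   # padding
    [1.5, 2.0],
    [0.0, 0.0],   # padding
]
trimmed, n_computed = trimmed_forward(batch_t)
baseline = masked_forward(batch_t)
```

In this sketch both functions return the same output, but the trimmed version calls `step` on only two of the four rows. On a GPU the equivalent saving would come from multiplying a smaller matrix, which this toy example does not model.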

Tags: Computer science, CUDA, Deep learning, Lua, Machine learning, Neural networks, NLP, nVidia, Package, RNN, Torch

May 3, 2016 by hgpu


