TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing

Jin-Hwa Kim, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang
Interdisciplinary Program in Cognitive Science, Seoul National University
Proceedings of KIIS Spring Conference, Vol. 26, No. 1, 2016


@inproceedings{kim2016trimzero,
   title={TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing},
   author={Kim, Jin-Hwa and Kim, Jeonghee and Ha, Jung-Woo and Zhang, Byoung-Tak},
   booktitle={Proceedings of KIIS Spring Conference},
   volume={26},
   number={1},
   year={2016}
}

Deep learning frameworks supported by the CUDA parallel computing platform have accelerated advances in machine learning research. The advantage of parallel processing largely comes from efficient matrix-matrix multiplication on CUDA-enabled graphics processing units (GPUs). To exploit this for recurrent neural networks (RNNs), a learning batch of variable-length sentences must be represented as a zero-filled matrix; however, these zeros waste computational resources. We propose an efficient algorithm that trims the zeros from the batch while producing the same result. Benchmark results validate our method with approximately 25% faster learning, and a natural language task empirically confirms these results.
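The idea behind the trimming can be illustrated with a minimal sketch: at each timestep, rows of the zero-padded batch that contain only padding are excluded from the matrix multiplication, and their hidden states are simply carried over. This is a hypothetical NumPy illustration, not the authors' Torch module; the cell (`rnn_step`), the padding convention (all-zero input vectors mark padding), and all function names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(h, x, W, U):
    # One step of a plain tanh RNN cell (hypothetical stand-in for the Torch module).
    return np.tanh(x @ W + h @ U)

def forward_padded(batch, W, U):
    # Baseline: run every timestep on the full zero-padded batch,
    # masking out padded rows so their hidden state is carried over.
    B, T, D = batch.shape
    h = np.zeros((B, U.shape[0]))
    for t in range(T):
        x = batch[:, t, :]
        mask = np.any(x != 0, axis=1)  # True for real (non-padding) tokens
        h = np.where(mask[:, None], rnn_step(h, x, W, U), h)
    return h

def forward_trimzero(batch, W, U):
    # Trimmed version: at each timestep, compute only on non-padding rows,
    # so the matmul runs on a smaller sub-batch.
    B, T, D = batch.shape
    h = np.zeros((B, U.shape[0]))
    for t in range(T):
        x = batch[:, t, :]
        mask = np.any(x != 0, axis=1)
        if mask.any():
            h[mask] = rnn_step(h[mask], x[mask], W, U)
    return h
```

Because the cell is applied row-wise, computing the step on the trimmed sub-batch yields exactly the rows the masked baseline would produce, so the two forwards agree while the trimmed one performs smaller multiplications.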
