Multi-Lingual Speech Recognition with Low-Rank Multi-Task Deep Neural Networks

Aanchan Mohan, Richard Rose
Department of Electrical and Computer Engineering, McGill University, Montreal, Canada
IEEE International Conference on Acoustics, Speech and Signal Processing, 2015








Multi-task learning (MTL) for deep neural network (DNN) multilingual acoustic models has been shown to be effective for learning parameters that are common or shared between multiple languages [1, 2]. In the MTL paradigm, the number of parameters in the output layer is large and scales with the number of languages used in training, so this output layer becomes a computational bottleneck. For mono-lingual DNNs, low-rank matrix factorization (LRMF) of weight matrices has yielded large computational savings [3, 4]. The LRMF proposed in this work for MTL has the original language-specific block matrices "share" a common matrix, leaving low-rank language-specific block matrices. The impact of LRMF is presented in two scenarios: (a) improving performance in a target language when auxiliary languages are included during multi-lingual training; and (b) cross-language transfer to an unseen language with only 1 hour of transcribed training data. A 44% parameter reduction in the final layer yields a lower memory footprint and faster training times. An experimental study shows that the LRMF multi-lingual DNN provides competitive performance compared to a full-rank multi-lingual DNN in both scenarios.
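The shared-matrix idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes one plausible reading of the factorization: each language-specific output block of size hidden_dim x n_outputs is replaced by a product of a common hidden_dim x rank matrix and a language-specific rank x n_outputs block. The function names, the layer sizes, the rank, the language labels, and the omission of bias terms are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def init_low_rank_output_layer(hidden_dim, rank, outputs_per_language, seed=0):
    """Create one shared matrix and one low-rank block per language (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Common factor used by every language: hidden_dim x rank.
    shared = rng.standard_normal((hidden_dim, rank)) * 0.01
    # Language-specific factors: rank x n_outputs for each language.
    blocks = {
        lang: rng.standard_normal((rank, n_out)) * 0.01
        for lang, n_out in outputs_per_language.items()
    }
    return shared, blocks


def language_posteriors(hidden, shared, block):
    """Softmax over one language's outputs, computed through the shared factor."""
    logits = (hidden @ shared) @ block
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)


def parameter_counts(hidden_dim, rank, outputs_per_language):
    """Weights in the full-rank output layer versus the factored one (biases ignored)."""
    total_out = sum(outputs_per_language.values())
    full = hidden_dim * total_out
    low = hidden_dim * rank + rank * total_out
    return full, low


# Assumed sizes, chosen only for illustration.
outputs_per_language = {"lang_a": 1000, "lang_b": 1200, "lang_c": 800}
hidden_dim, rank = 1024, 256

shared, blocks = init_low_rank_output_layer(hidden_dim, rank, outputs_per_language)
hidden = np.random.default_rng(1).standard_normal((4, hidden_dim))
post = language_posteriors(hidden, shared, blocks["lang_b"])
full, low = parameter_counts(hidden_dim, rank, outputs_per_language)
print(f"full-rank weights: {full}, low-rank weights: {low}")
```

In this sketch the cost of the shared factor is paid once, while each additional language adds only rank x n_outputs weights instead of hidden_dim x n_outputs, which is why the saving grows with the number of languages. The reduction obtained with these assumed sizes is unrelated to the 44% figure reported in the abstract.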