RASR/NN: The RWTH Neural Network Toolkit for Speech Recognition
Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014
@article{simon2014rasr,
title={RASR/NN: THE RWTH NEURAL NETWORK TOOLKIT FOR SPEECH RECOGNITION},
author={Wiesler, Simon and Richard, Alexander and Golik, Pavel and Schl{\"u}ter, Ralf and Ney, Hermann},
year={2014}
}
This paper describes the new release of RASR, the open source version of the well-proven speech recognition toolkit developed and used at RWTH Aachen University. The focus is on the implementation of the NN module for training neural network acoustic models. We describe the code design, configuration, and features of the NN module. Its key features are high flexibility regarding the network topology, choice of activation functions, training criteria, and optimization algorithm, as well as built-in support for efficient GPU computing. Run-time performance and recognition accuracy are evaluated on the example of a deep neural network used as the acoustic model in a hybrid NN/HMM system. The results show that RASR achieves state-of-the-art performance on a real-world large vocabulary task, while offering a complete pipeline for building and applying large scale speech recognition systems.
March 9, 2014 by hgpu