15073
Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Yi Wang, Zhiqian Wang, Chong Wang, Bo Xiao, Dani Yogatama, Jun Zhan, Zhenyao Zhu
We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech–two vastly different languages. Because it replaces entire pipelines of hand-engineered components with neural networks, end-to-end learning allows us to handle a diverse variety of speech including noisy environments, accents and different languages. Key to our approach […]
Martin Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viegas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, Xiaoqiang Zheng
TensorFlow [1] is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines […]
Felix Weninger, Johannes Bergmann, Bjorn Schuller
In this article, we introduce CURRENNT, an open-source parallel implementation of deep recurrent neural networks (RNNs) supporting graphics processing units (GPUs) through NVIDIA’s Computed Unified Device Architecture (CUDA). CURRENNT supports uni- and bidirectional RNNs with Long Short-Term Memory (LSTM) memory cells which overcome the vanishing gradient problem. To our knowledge, CURRENNT is the first publicly […]
Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation performance of the front-end network using phone-class information. At the front-end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, […]
View View   Download Download (PDF)   
Johann Hauswald, Yiping Kang, Michael A. Laurenzano, Quan Chen, Cheng Li, Trevor Mudge, Ronald G. Dreslinski, Jason Mars, Lingjia Tang
As applications such as Apple Siri, Google Now, Microsoft Cortana, and Amazon Echo continue to gain traction, webservice companies are adopting large deep neural networks (DNN) for machine learning challenges such as image processing, speech recognition, natural language processing, among others. A number of open questions arise as to the design of a server platform […]
View View   Download Download (PDF)   
Aanchan Mohan, Richard Rose
Multi-task learning (MTL) for deep neural network (DNN) multilingual acoustic models has been shown to be effective for learning parameters that are common or shared between multiple languages[1, 2]. In the MTL paradigm, the number of parameters in the output layer is large and scales with the number of languages used in training. This output […]
View View   Download Download (PDF)   
Kalin Ovtcharov, Olatunji Ruwase, Joo-Young Kim, Jeremy Fowers, Karin Strauss, Eric S. Chung
Recent breakthroughs in the development of multi-layer convolutional neural networks have led to stateof-the-art improvements in the accuracy of non-trivial recognition tasks such as large-category image classification and automatic speech recognition [1]. These many-layered neural networks are large, complex, and require substantial computing resources to train and evaluate [2]. Unfortunately, these demands come at an […]
View View   Download Download (PDF)   
Leiming Yu, Yash Ukidave, David Kaeli
Speech recognition is used in a wide range of applications and devices such as mobile phones, in-car entertainment systems and web-based services. Hidden Markov Models (HMMs) is one of the most popular algorithmic approaches applied in speech recognition. Training and testing a HMM is computationally intensive and time-consuming. Running multiple applications concurrently with speech recognition […]
Leiming Yu, John Magrath, Ajey Pandey, Matthew Sears, David Kaeli
Speech Recognition run on Graphic Processing Units (GPUs) has shown some promising performance improvements ranging 2-10x speedups when compare to execution on CPUs. GPU has continued to introduce new programming features, such as Dynamic Parallelism and Hyper-Q, that could further benefit Speech Recognition processing. In this paper we describe a framework developed at Northeastern describing […]
View View   Download Download (PDF)   
Ming Zeng, Le T. Nguyen, Bo Yu, Ole J. Mengshoel, Jiang Zhu, Pang Wu, Joy Zhang
A variety of real-life mobile sensing applications are becoming available, especially in the life-logging, fitness tracking and health monitoring domains. These applications use mobile sensors embedded in smart phones to recognize human activities in order to get a better understanding of human behavior. While progress has been made, human activity recognition remains a challenging task. […]
View View   Download Download (PDF)   
Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Y. Ng
We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, our system does not need hand-designed components to model background noise, reverberation, […]
View View   Download Download (PDF)   
Josef Michalek, Jan Vanek
An extraction of feature-vectors from speech audio signal is a computationally intensive task. However, MFCC and PLP features remain the most popular for more than a decade. We made a GPU-accelerated implementation of the feature extraction processing. The implementation produces identical features as the reference Hidden Markov Toolkit (HTK) but in a fraction of the […]
View View   Download Download (PDF)   
Page 1 of 41234

* * *

* * *

Follow us on Twitter

HGPU group

1752 peoples are following HGPU @twitter

Like us on Facebook

HGPU group

371 people like HGPU on Facebook

HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors

Contact us: