16872

Language Modeling with Gated Convolutional Networks

Yann N. Dauphin, Angela Fan, Michael Auli, David Grangier
Facebook AI Research
arXiv:1612.08083 [cs.CL], (23 Dec 2016)

@article{dauphin2016language,

   title={Language Modeling with Gated Convolutional Networks},

   author={Dauphin, Yann N. and Fan, Angela and Auli, Michael and Grangier, David},

   year={2016},

   month={dec},

   archivePrefix={"arXiv"},

   primaryClass={cs.CL}

}

Download Download (PDF)   View View   Source Source   

5785

views

The pre-dominant approach to language modeling to date is based on recurrent neural networks. In this paper we present a convolutional approach to language modeling. We introduce a novel gating mechanism that eases gradient propagation and which performs better than the LSTM-style gating of (Oord et al, 2016) despite being simpler. We achieve a new state of the art on WikiText-103 as well as a new best single-GPU result on the Google Billion Word benchmark. In settings where latency is important, our model achieves an order of magnitude speed-up compared to a recurrent baseline since computation can be parallelized over time. To our knowledge, this is the first time a non-recurrent approach outperforms strong recurrent models on these tasks.
Rating: 1.8/5. From 3 votes.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: