https://hgpu.org/?p=8843
Regularization and nonlinearities for neural language models: when are they needed?