https://hgpu.org/?p=13769
Single stream parallelization of generalized LSTM-like RNNs on a GPU