https://hgpu.org/?p=15883
Attention-based NMT Models as Feature Functions in Phrase-based SMT