Attention-based NMT Models as Feature Functions in Phrase-based SMT
Faculty of Mathematics and Computer Science, Adam Mickiewicz University in Poznan
arXiv:1605.04809 [cs.CL] (16 May 2016)
This paper describes the AMU-UEDIN submissions to the WMT 2016 shared task on news translation. We explore methods of decode-time integration of attention-based neural translation models with phrase-based statistical machine translation. Efficient batch algorithms for GPU querying are proposed and implemented. For English-Russian, the phrase-based system cannot surpass state-of-the-art stand-alone neural models. For the Russian-English task, our submission achieves the top BLEU result, outperforming the best pure-neural system by 1.1 BLEU points and our own phrase-based baseline by 1.6 BLEU points. In follow-up experiments we improve these results by an additional 0.7 BLEU points.
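As a rough illustration of what using a neural model as a feature function at decode time might look like, the Python sketch below adds one extra feature to a log-linear phrase-based score and obtains that feature for all pending hypothesis expansions in a single batched call. It is a minimal sketch under stated assumptions, not the AMU-UEDIN implementation: the names `Expansion`, `toy_nmt_logprob_batch`, and `score_expansions`, the feature names, and the weights are all hypothetical, and the toy scorer merely stands in for a batched GPU query to an attention-based model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Words = Tuple[str, ...]


@dataclass
class Expansion:
    """One candidate hypothesis expansion in a phrase-based decoder (hypothetical)."""
    prefix: Words               # target words produced so far
    phrase: Words               # target phrase being appended
    features: Dict[str, float]  # scores from the usual SMT feature functions


def toy_nmt_logprob_batch(prefixes: Sequence[Words],
                          phrases: Sequence[Words]) -> List[float]:
    """Stand-in for a batched neural query (assumption, not a real model).

    A real system would return log P(phrase | source, prefix) for every pair
    in one GPU call. Here each word simply costs -1.0.
    """
    return [-1.0 * len(phrase) for phrase in phrases]


def score_expansions(
    expansions: Sequence[Expansion],
    weights: Dict[str, float],
    nmt_batch: Callable[[Sequence[Words], Sequence[Words]], List[float]]
    = toy_nmt_logprob_batch,
) -> List[float]:
    """Log-linear scores with the neural log-probability as one more feature."""
    # Collect every pending expansion and query the neural scorer once,
    # instead of issuing one call per hypothesis.
    nmt_scores = nmt_batch([e.prefix for e in expansions],
                           [e.phrase for e in expansions])
    totals = []
    for expansion, nmt_score in zip(expansions, nmt_scores):
        features = dict(expansion.features, nmt=nmt_score)
        # Weighted sum of feature values, as in a log-linear model.
        totals.append(sum(weights[name] * value
                          for name, value in features.items()))
    return totals
```

In this sketch the neural score is weighted like any other feature, so its weight could in principle be tuned together with the rest; grouping the queries into one call is what makes a GPU-backed scorer practical inside the decoder's search loop.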
May 17, 2016 by hgpu