high performance computing on graphics processing units: hgpu.org

Attention-based NMT Models as Feature Functions in Phrase-based SMT

Marcin Junczys-Dowmunt, Tomasz Dwojak, Rico Sennrich

Faculty of Mathematics and Computer Science, Adam Mickiewicz University in Poznan

arXiv:1605.04809 [cs.CL] (16 May 2016)

@article{junczys-dowmunt2016amuuedin,
  title={The AMU-UEDIN Submission to the WMT16 News Translation Task: Attention-based NMT Models as Feature Functions in Phrase-based SMT},
  author={Junczys-Dowmunt, Marcin and Dwojak, Tomasz and Sennrich, Rico},
  year={2016},
  month={may},
  eprint={1605.04809},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

This paper describes the AMU-UEDIN submissions to the WMT 2016 shared task on news translation. We explore methods of decode-time integration of attention-based neural translation models with phrase-based statistical machine translation. Efficient batch algorithms for GPU querying are proposed and implemented. For English-Russian, the phrase-based system cannot surpass state-of-the-art stand-alone neural models. For the Russian-English task, our submission achieves the top BLEU result, outperforming the best pure-neural system by 1.1 BLEU points and our own phrase-based baseline by 1.6 BLEU. In follow-up experiments we improve these results by an additional 0.7 BLEU.
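The abstract does not spell out how the neural model is queried, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a log-linear phrase-based decoder in which the neural model's log-probability is one more weighted feature, and it assumes that decoder queries can be collected and scored in a single batched call. All names (`Query`, `loglinear_score`, `score_batch`, `toy_nmt`), the feature weights, and the stand-in scoring function are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical query type: (target-side prefix, candidate next word).
Query = Tuple[Tuple[str, ...], str]


def loglinear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values; the NMT log-probability is one more feature."""
    return sum(weights[name] * value for name, value in features.items())


def score_batch(
    queries: Sequence[Query],
    nmt_batch_logprob: Callable[[List[Query]], List[float]],
) -> List[float]:
    """Score many decoder queries with one batched model call.

    Duplicate queries are collapsed first, so the (expensive) model
    sees each distinct query only once.
    """
    unique = list(dict.fromkeys(queries))  # order-preserving de-duplication
    table = dict(zip(unique, nmt_batch_logprob(unique)))
    return [table[q] for q in queries]


def toy_nmt(batch: List[Query]) -> List[float]:
    """Stand-in for a GPU model: a made-up score, NOT a real NMT model."""
    return [-0.5 * (len(prefix) + 1) for prefix, _word in batch]


# Three queries from competing hypotheses; the first and third are identical.
queries = [(("the",), "cat"), (("the",), "dog"), (("the",), "cat")]
nmt_scores = score_batch(queries, toy_nmt)

# Illustrative weights only; a real system would tune them on held-out data.
weights = {"tm": 0.5, "lm": 0.3, "nmt": 0.2}
total = loglinear_score({"tm": -1.0, "lm": -2.0, "nmt": nmt_scores[0]}, weights)
```

Collecting queries before calling the model is what makes a GPU worthwhile in this setting: one large call amortizes launch and transfer overhead that many single-hypothesis calls would each pay.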

Tags: Algorithms, Computer science, CUDA, Machine learning, NLP, nVidia, nVidia GeForce GTX 970

May 17, 2016 by hgpu
