high performance computing on graphics processing units: hgpu.org

fairseq: A Fast, Extensible Toolkit for Sequence Modeling

Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli

Facebook AI Research

arXiv:1904.01038 [cs.CL] (1 Apr 2019)

@misc{ott2019fairseq,
  title={fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author={Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  year={2019},
  eprint={1904.01038},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

Source code package: fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python

fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found below.

Tags: Computer science, CUDA, NLP, nVidia, Package, PyTorch, Video

April 7, 2019 by hgpu
