high performance computing on graphics processing units: hgpu.org

Scaling Recurrent Neural Network Language Models

Will Williams, Niranjani Prasad, David Mrva, Tom Ash, Tony Robinson

Cantab Research, Cambridge, UK

arXiv:1502.00512 [cs.CL], (2 Feb 2015)

@article{williams2015scaling,
  title={Scaling Recurrent Neural Network Language Models},
  author={Williams, Will and Prasad, Niranjani and Mrva, David and Ash, Tom and Robinson, Tony},
  year={2015},
  month={feb},
  archivePrefix={arXiv},
  eprint={1502.00512},
  primaryClass={cs.CL}
}

This paper investigates the scaling properties of Recurrent Neural Network Language Models (RNNLMs). We discuss how to train very large RNNs on GPUs and address the questions of how RNNLMs scale with respect to model size, training-set size, computational costs and memory. Our analysis shows that despite being more costly to train, RNNLMs obtain much lower perplexities on standard benchmarks than n-gram models. We train the largest known RNNs and present relative word error rate gains of 18% on an ASR task. We also present the new lowest perplexities on the recently released billion word language modelling benchmark, a 1 BLEU point gain on machine translation and a 17% relative hit rate gain in word prediction.
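
The abstract reports results as perplexities and relative gains. As a reading aid, here is a minimal Python sketch of how those two quantities are commonly computed, assuming the usual definitions (perplexity as the exponential of the mean negative log-probability per word, and a relative gain as the change divided by the baseline). The function names and all numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def perplexity(word_probs):
    """Perplexity over a word sequence, given the probability the model
    assigned to each observed word: exp(mean negative log-probability)."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    mean_neg_log = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(mean_neg_log)


def relative_reduction(baseline, improved):
    """Relative reduction of an error-like metric such as word error rate."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - improved) / baseline


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
uniform_over_four_words = [0.25] * 8
print(perplexity(uniform_over_four_words))   # approximately 4
print(relative_reduction(10.0, 8.2))         # approximately 0.18
```

Under these definitions, a model that is uniformly uncertain among four words has a perplexity of about 4, and a drop in word error rate from a hypothetical 10.0% to 8.2% is a relative gain of about 18%.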

Tags: Computer science, CUDA, Machine learning, Neural networks, nVidia, nVidia GeForce GTX Titan

February 3, 2015 by hgpu
