high performance computing on graphics processing units: hgpu.org

Deep Voice 3: 2000-Speaker Neural Text-to-Speech

Wei Ping, Kainan Peng, Andrew Gibiansky, Sercan O. Arik, Ajay Kannan, Sharan Narang, Jonathan Raiman, John Miller

Baidu Research

arXiv:1710.07654 [cs.SD] (20 Oct 2017)

@article{ping2017deep,
  title={Deep Voice 3: 2000-Speaker Neural Text-to-Speech},
  author={Ping, Wei and Peng, Kainan and Gibiansky, Andrew and Arik, Sercan O. and Kannan, Ajay and Narang, Sharan and Raiman, Jonathan and Miller, John},
  year={2017},
  month={oct},
  eprint={1710.07654},
  archivePrefix={arXiv},
  primaryClass={cs.SD}
}

We present Deep Voice 3, a fully-convolutional attention-based neural text-to-speech (TTS) system. Deep Voice 3 matches state-of-the-art neural speech synthesis systems in naturalness while training ten times faster. We scale Deep Voice 3 to data set sizes unprecedented for TTS, training on more than eight hundred hours of audio from over two thousand speakers. In addition, we identify common error modes of attention-based speech synthesis networks, demonstrate how to mitigate them, and compare several different waveform synthesis methods. We also describe how to scale inference to ten million queries per day on one single-GPU server.
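To put the abstract's scale figures in perspective, the short sketch below converts them into average rates. It is only back-of-the-envelope arithmetic: the function names are hypothetical, the inputs are the rounded figures quoted in the abstract ("ten million queries per day", "more than eight hundred hours", "over two thousand speakers"), and the results are averages that say nothing about peak load or how audio is actually distributed across speakers.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def queries_per_second(queries_per_day: float) -> float:
    """Average sustained query rate implied by a daily query volume."""
    return queries_per_day / SECONDS_PER_DAY


def avg_minutes_per_speaker(total_hours: float, num_speakers: int) -> float:
    """Average audio per speaker, in minutes, assuming an even split."""
    return total_hours * 60 / num_speakers


if __name__ == "__main__":
    # Rounded figures taken from the abstract; treat them as approximate.
    qps = queries_per_second(10_000_000)
    minutes = avg_minutes_per_speaker(800, 2000)
    print(f"{qps:.1f} queries/second on average")      # 115.7
    print(f"{minutes:.0f} minutes of audio per speaker")  # 24
```

Under these assumptions, ten million queries per day corresponds to roughly 116 queries per second sustained on the single-GPU server, and the training set averages on the order of 24 minutes of audio per speaker.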

Tags: Computer science, CUDA, Deep learning, nVidia, Speech recognition, Tesla P100

October 24, 2017 by hgpu
