high performance computing on graphics processing units: hgpu.org

FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation

Songxiang Liu, Yuewen Cao, Na Hu, Dan Su, Helen Meng

Human-Computer Communications Laboratory, The Chinese University of Hong Kong

arXiv:2011.05731 [eess.AS], (11 Nov 2020)

@misc{liu2020fastsvc,
  title={FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation},
  author={Songxiang Liu and Yuewen Cao and Na Hu and Dan Su and Helen Meng},
  year={2020},
  eprint={2011.05731},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}

This paper presents FastSVC, a lightweight cross-domain singing voice conversion (SVC) system that achieves high conversion performance with inference 4x faster than real time on CPUs. FastSVC uses a Conformer-based phoneme recognizer to extract singer-agnostic linguistic features from singing signals. A generator based on feature-wise linear modulation synthesizes the waveform directly from the linguistic features, leveraging information from sine-excitation signals and loudness features. The waveform generator can be trained conveniently using a multi-resolution spectral loss and an adversarial loss. Experimental results show that, compared with a computationally heavy baseline system, the proposed FastSVC system achieves comparable conversion performance in some scenarios and significantly better conversion performance in others. Moreover, FastSVC achieves desirable cross-lingual singing conversion performance. Its inference is 3x and 70x faster than the baseline system on GPUs and CPUs, respectively.
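To make the term concrete, feature-wise linear modulation scales and shifts each feature channel of a hidden representation using values computed from a conditioning signal. The following is a minimal NumPy sketch of that operation only, not the FastSVC generator itself. The function name `film`, the tensor shapes, the random projection matrices `w_gamma` and `w_beta`, and the two-row conditioning array (standing in for per-frame sine-excitation and loudness values) are all illustrative assumptions; the abstract does not specify how the paper derives the scale and shift.

```python
import numpy as np


def film(features, gamma, beta):
    """Feature-wise linear modulation: element-wise scale, then shift."""
    return gamma * features + beta


rng = np.random.default_rng(0)
channels, frames = 4, 6

# Hidden features of a hypothetical generator layer: (channels, frames).
hidden = rng.standard_normal((channels, frames))

# Hypothetical conditioning: one row for a sine-excitation value and one
# row for a loudness value per frame (shapes chosen for illustration).
cond = rng.standard_normal((2, frames))

# Assumption: plain linear projections map the conditioning to a
# per-channel, per-frame scale (gamma) and shift (beta).
w_gamma = rng.standard_normal((channels, 2))
w_beta = rng.standard_normal((channels, 2))
gamma = w_gamma @ cond
beta = w_beta @ cond

modulated = film(hidden, gamma, beta)
print(modulated.shape)
```

With `gamma` set to all ones and `beta` to all zeros, `film` returns its input unchanged, which is a quick sanity check that the modulation is purely an affine transform per feature.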

Tags: Computer science, nVidia, Speech recognition, Tesla P40

November 15, 2020 by hgpu
