GPU-Friendly Local Regression for Voice Conversion
Computer Science Division, University of California, Berkeley
Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT), 2015
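@inproceedings{berg2015gpu,
title={GPU-Friendly Local Regression for Voice Conversion},
author={Berg-Kirkpatrick, Taylor and Klein, Dan},
booktitle={Conference of the North American Chapter of the Association for Computational Linguistics -- Human Language Technologies (NAACL HLT)},
year={2015}
}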
Voice conversion is the task of transforming a source speaker’s voice so that it sounds like a target speaker’s voice. We present a GPU-friendly local regression model for voice conversion that converts speech in real time and achieves state-of-the-art accuracy on this task. Our model uses a new approximation for computing local regression coefficients that is explicitly designed to preserve memory locality. As a result, our inference procedure is amenable to efficient implementation on the GPU. Our approach is more than 10X faster than a highly optimized CPU-based implementation and converts speech 2.7X faster than real time.
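For readers unfamiliar with the term, local regression in general fits a separate, kernel-weighted regression around each query point. The snippet below is a minimal one-dimensional sketch of that generic idea, not the approximation proposed in the paper: the function name `local_linear_predict`, the Gaussian kernel, and the `bandwidth` parameter are all assumptions made for illustration, and a voice conversion system would operate on multi-dimensional acoustic feature vectors rather than scalars.

```python
import math


def local_linear_predict(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` by fitting a kernel-weighted line to (xs, ys).

    Hypothetical helper for illustration only; this is generic local
    linear regression, not the paper's GPU-oriented approximation.
    """
    # Gaussian kernel weights: training points near the query count more.
    weights = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    total = sum(weights)

    # Weighted means of the inputs and outputs.
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total

    # Weighted covariance over weighted variance gives the local slope.
    cov = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    var = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    slope = cov / var if var > 0 else 0.0

    # Evaluate the locally fitted line at the query point.
    return y_bar + slope * (query - x_bar)
```

Because the regression coefficients are recomputed for every query, a naive implementation touches many training points per prediction, which is the cost that a locality-preserving approximation such as the one described in the abstract aims to reduce.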
June 24, 2015 by hgpu