high performance computing on graphics processing units: hgpu.org

Fast Speaker Diarization Using a High-Level Scripting Language

Ekaterina Gonina, Gerald Friedland, Henry Cook, Kurt Keutzer

University of California, Berkeley

Automatic Speech Recognition and Understanding Workshop, 2011

@article{gonina2011fast,
  title={Fast Speaker Diarization Using a High-Level Scripting Language},
  author={Gonina, E. and Friedland, G. and Cook, H. and Keutzer, K.},
  year={2011}
}

Package: Speaker Diarization

Most current speaker diarization systems use agglomerative clustering of Gaussian Mixture Models (GMMs) to determine "who spoke when" in an audio recording. While state-of-the-art in accuracy, this method is computationally costly, mostly because of GMM training, which limits current approaches to roughly real-time performance. Growing datasets now require processing hundreds of hours of audio, which makes more efficient processing methods highly desirable. With the emergence of highly parallel multicore and manycore processors, such as graphics processing units (GPUs), GMM training can be re-implemented to run faster than real time by exploiting the parallelism in the training computation. However, developing and maintaining complex low-level GPU code is difficult and requires a deep understanding of the hardware architecture of the parallel processor. Furthermore, such low-level implementations are not readily reusable in other applications and are not portable to other platforms, which limits programmer productivity. In this paper we present a speaker diarization system captured in under 50 lines of Python that runs 50-250x faster than real time by using a specialization framework to automatically map and execute the computationally intensive GMM training on an NVIDIA GPU, without significant loss in accuracy.
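To make concrete why GMM training dominates the cost and parallelizes well, here is a minimal sketch of one expectation-maximization (EM) iteration in plain NumPy. It is not the authors' code and does not use their specialization framework or a GPU; the function names (`log_gaussians`, `em_step`), the diagonal-covariance assumption, the variance floor, and the synthetic data are all illustrative assumptions. The point is that every frame-component pair is scored independently, which is the kind of data parallelism a GPU implementation can exploit.

```python
import numpy as np

def log_gaussians(X, means, variances):
    """Log-density of each frame under each diagonal Gaussian, shape (N, M)."""
    # X: (N, D) feature frames; means, variances: (M, D) per component.
    diff = X[:, None, :] - means[None, :, :]                  # (N, M, D)
    quad = np.sum(diff ** 2 / variances, axis=2)              # (N, M)
    log_det = np.sum(np.log(2 * np.pi * variances), axis=1)   # (M,)
    return -0.5 * (quad + log_det)

def em_step(X, weights, means, variances, var_floor=1e-6):
    """One EM iteration; returns updated parameters and the log-likelihood
    of X under the parameters that were passed in."""
    # E-step: responsibilities, computed independently for every frame.
    log_p = np.log(weights) + log_gaussians(X, means, variances)   # (N, M)
    log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)   # (N, 1)
    resp = np.exp(log_p - log_norm)

    # M-step: weighted sufficient statistics, reduced over all frames.
    # Assumes no component ends up with zero total responsibility.
    n_k = resp.sum(axis=0)                                         # (M,)
    new_weights = n_k / X.shape[0]
    new_means = (resp.T @ X) / n_k[:, None]
    new_vars = (resp.T @ X ** 2) / n_k[:, None] - new_means ** 2
    new_vars = np.maximum(new_vars, var_floor)  # guard against collapse
    return new_weights, new_means, new_vars, float(log_norm.sum())

# Synthetic two-cluster data standing in for acoustic feature frames.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2.0, 1.0, (500, 2)),
                    rng.normal(3.0, 1.0, (500, 2))])

weights = np.array([0.5, 0.5])
means = np.array([[-1.0, -1.0], [1.0, 1.0]])
variances = np.ones((2, 2))

log_likelihoods = []
for _ in range(20):
    weights, means, variances, ll = em_step(X, weights, means, variances)
    log_likelihoods.append(ll)
```

In an agglomerative diarization loop, a step like this would be repeated many times for every cluster model after each merge, which is why accelerating it has such a large effect on the overall runtime.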

Tags: Clustering, Computer science, CUDA, nVidia, nVidia GeForce GTX 480, Python, Speech recognition

October 31, 2011 by hgpu
