Deep Dynamic Neural Networks for Gesture Segmentation and Recognition



Di Wu, Ling Shao

The University of Sheffield

ECCV ChaLearn Looking at People Workshop, 2014


Source code package: 3DCNN_HMM


This paper describes a novel method, Deep Dynamic Neural Networks (DDNN), for Track 3 of the ChaLearn Looking at People 2014 challenge [1]. A generalised semi-supervised hierarchical dynamic framework is proposed for simultaneous gesture segmentation and recognition, taking both skeleton and depth images as input modalities. First, Deep Belief Networks (DBN) and 3D Convolutional Neural Networks (3DCNN) are applied to the skeletal and depth data, respectively, to extract high-level spatio-temporal features. The learned representations are then used to estimate the emission probabilities of Hidden Markov Models, from which an action sequence is inferred. The framework can be easily extended by including an ergodic state to segment and recognise video sequences frame by frame, making online segmentation and recognition possible for diverse input modalities. Some normalisation details pertaining to the preprocessing of raw features are also discussed. This purely data-driven approach achieves a score of 0.8162 in this gesture spotting challenge. The performance is on par with a variety of state-of-the-art hand-tuned-feature approaches and other learning-based methods, opening the door to using deep learning techniques to explore multimodal time series data.
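To make the decoding step in the abstract concrete, here is a minimal sketch of Viterbi decoding over per-frame scores. It is not the authors' implementation. It assumes one HMM state per gesture plus a single rest state standing in for the ergodic state, and it assumes the network outputs are used directly as emission scores. The function names (`viterbi`, `segments`), the transition values, and the toy per-frame scores are all hypothetical.

```python
import math

def viterbi(log_emis, log_trans, log_init):
    """Return the most likely state path (one state index per frame).

    log_emis[t][s]  : log emission score of state s at frame t
    log_trans[i][j] : log probability of moving from state i to state j
    log_init[s]     : log probability of starting in state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emis[0][s] for s in range(n_states)]
    backptrs = []
    for t in range(1, len(log_emis)):
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor of state j at frame t.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emis[t][j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def segments(path, rest_state=0):
    """Collapse a frame-wise path into (state, first_frame, last_frame) runs,
    dropping runs of the rest state."""
    out, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            if path[start] != rest_state:
                out.append((path[start], start, t - 1))
            start = t
    return out

def logs(row):
    return [math.log(p) for p in row]

# Hypothetical toy setup: state 0 = rest, states 1 and 2 = two gestures.
init = logs([0.8, 0.1, 0.1])
trans = [logs(r) for r in ([0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6])]
# Made-up per-frame scores standing in for network outputs; frame 3 is noisy.
emis = [logs(r) for r in (
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.35, 0.45],
    [0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8], [0.1, 0.1, 0.8],
)]

path = viterbi(emis, trans, init)
gesture_runs = segments(path)
print(path)          # [0, 0, 1, 1, 1, 0, 2, 2]
print(gesture_runs)  # [(1, 2, 4), (2, 6, 7)]
```

In this toy run the transition model smooths over the noisy frame 3, where gesture 2 has the highest per-frame score, and the rest state separates the two gestures. Segmentation and recognition therefore come out of the same decoding pass.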

Tags: CNN, Computer science, CUDA, Neural networks, nVidia, Package, Python, Rendering

October 4, 2014 by hgpu

