Probability-based Dynamic Time Warping and Bag-of-Visual-and-Depth-Words for Human Gesture Recognition in RGB-D

Abstract

We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late-fusion scheme. The method is integrated into a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm, which is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a probabilistic model of that gesture class, driven by a Gaussian Mixture Model. Results of the whole Human Gesture Recognition pipeline on a public data set show better performance than both the standard BoVW model and the standard DTW approach.
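To make the DTW idea in the abstract concrete, the following is a minimal sketch of classic DTW with a pluggable local cost. It is not the authors' PDTW formulation, which this record does not give. The names `dtw_cost`, `euclidean`, and `gaussian_cost` are hypothetical, and the probability-based cost assumes a single diagonal Gaussian per model frame, a simplification of the Gaussian Mixture Model mentioned above.

```python
import numpy as np


def dtw_cost(seq_a, seq_b, local_cost):
    """Classic DTW: minimum cumulative cost of aligning seq_a with seq_b."""
    n, m = len(seq_a), len(seq_b)
    # acc[i, j] holds the best cost of aligning the first i and j frames.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = local_cost(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: advance in seq_a, in seq_b, or in both.
            acc[i, j] = step + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])


def euclidean(x, y):
    """Standard DTW local cost between two frames."""
    diff = np.atleast_1d(np.asarray(x, dtype=float)) - np.atleast_1d(np.asarray(y, dtype=float))
    return float(np.linalg.norm(diff))


def gaussian_cost(model_frame, x):
    """Hypothetical probability-based local cost (an assumption, not the paper's).

    model_frame is a (mean, variance) pair for one diagonal Gaussian.
    The cost is 0 when x equals the mean and approaches 1 as x becomes unlikely.
    """
    mean, var = model_frame
    x = np.atleast_1d(np.asarray(x, dtype=float))
    d2 = float(np.sum((x - mean) ** 2 / var))
    return 1.0 - float(np.exp(-0.5 * d2))
```

With `euclidean`, the model sequence is a single example of a gesture. With `gaussian_cost`, each model frame summarises several training samples of the same gesture class through a mean and a variance, so frames that vary widely across samples are penalised less for the same deviation.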

Document Type

Article


Accepted version

Language

English

Publisher

Elsevier B.V.

Related items

Postprint version of the document published at: https://doi.org/10.1016/j.patrec.2013.09.009

Pattern Recognition Letters, 2014, vol. 50, p. 112-121

https://doi.org/10.1016/j.patrec.2013.09.009

Rights

(c) Elsevier B.V., 2014
