Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine

Rastgoo, Razieh; Kiani, Kourosh; Escalera Guerrero, Sergio

dc.contributor.author

Rastgoo, Razieh

dc.contributor.author

Kiani, Kourosh

dc.contributor.author

Escalera Guerrero, Sergio

dc.date.issued

2020-04-24T14:07:24Z

dc.date.issued

2018-10-23

dc.identifier

1099-4300

dc.identifier

https://hdl.handle.net/2445/157458

dc.identifier

682657

dc.description.abstract

In this paper, a deep learning approach, the Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how well the RBM, as a deep generative model, can model the distribution of the input data to improve recognition of unseen data. Two modalities, RGB and depth, are considered as model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are used, and the hand in each crop is detected using a Convolutional Neural Network (CNN). Three types of detected hand image are then generated for each modality and input to RBMs. The outputs of the RBMs for the two modalities are fused in another RBM to recognize the sign label of the input image. The proposed multi-modal model is trained on all and part of the American alphabet and digits from four publicly available datasets. We also evaluate the robustness of the proposal against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusion methodology, achieves state-of-the-art results on the Massey University Gesture Dataset 2012, the American Sign Language (ASL) Fingerspelling Dataset from the University of Surrey's Center for Vision, Speech and Signal Processing, NYU, and ASL Fingerspelling A datasets.

dc.format

15 p.

dc.format

application/pdf

dc.language

eng

dc.publisher

MDPI

dc.relation

Reproduction of the document published at: https://doi.org/10.3390/e20110809

dc.relation

Entropy, 2018, vol. 20, num. 11, p. 809

dc.relation

https://doi.org/10.3390/e20110809

dc.rights

cc-by (c) Rastgoo, Razieh et al., 2018

dc.rights

http://creativecommons.org/licenses/by/3.0/es

dc.rights

info:eu-repo/semantics/openAccess

dc.source

Articles published in journals (Matemàtiques i Informàtica)

dc.subject

Llenguatge de signes

dc.subject

Sords

dc.subject

Aprenentatge

dc.subject

Sign language

dc.subject

Deaf

dc.subject

Learning

dc.title

Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine

dc.type

info:eu-repo/semantics/article

dc.type

info:eu-repo/semantics/publishedVersion

Files in this item

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following collection(s)

ISGlobal - Institut de Salut Global de Barcelona [60807]

Matemàtiques i Informàtica [1007]