Abstract:
|
This project consists of developing a gesture detection system that works in real time. In particular, it aims to detect the wink of an eye and raise a flag when that happens.
Some existing projects and systems already do this, but they focus on voice detection. This project follows the same principles but takes video as input instead of sound.
The pipeline was built in several parts. First, a convolutional neural network was designed to detect the gesture in a sequence of images, and it had to be trained to do so.
Second, a convolutional neural network for face detection was used to perform background subtraction, that is, to select the main region of the image.
Finally, several optimization methods were considered to make the processing operations faster.
Code was implemented in Python to test background subtraction as a way to reduce processing time, and it was used to obtain results on accuracy and processing time. However, results were obtained only for the background subtraction part, because the gesture detection part was ultimately left as future work owing to a lack of time and resources.
All in all, the results concern the simplification of the image through background subtraction based on a face detection method. Detecting the face region took an average of 0.65 seconds.
Since the system needs to capture an image with the camera, perform the background subtraction, and run a convolutional neural network several times to detect a gesture, we conclude that the time taken by face detection alone makes real-time operation impossible.
Further improvement is needed to make the system work in real time. |
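
The real-time argument above can be made concrete with a minimal sketch. It is not the project's code: the helper names `mean_latency` and `max_frame_rate` are hypothetical, and the sketch assumes the pipeline stages run one after another on each frame. Only the 0.65 s face detection average comes from the results reported above.

```python
import time
from statistics import mean


def mean_latency(fn, inputs):
    """Return the average wall-clock time, in seconds, of calling fn on each input."""
    timings = []
    for item in inputs:
        start = time.perf_counter()
        fn(item)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def max_frame_rate(stage_latencies):
    """Upper bound on frames per second when the stages run sequentially (assumption)."""
    return 1.0 / sum(stage_latencies)


# Face detection alone, using the 0.65 s average reported in the abstract.
FACE_DETECTION_S = 0.65
print(f"{max_frame_rate([FACE_DETECTION_S]):.2f} frames per second at most")
```

Under these assumptions, face detection by itself caps the pipeline at about 1.5 frames per second, before image capture and the gesture network are added to the list of stage latencies.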