Title:
|
Improving detection of acoustic events using audiovisual data and feature level fusion
|
Author:
|
Butko, Taras; Canton Ferrer, Cristian; Segura, C.; Giró Nieto, Xavier; Nadeu Camprubí, Climent; Hernando Pericás, Francisco Javier; Casas Pla, Josep Ramon
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla; Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo |
Abstract:
|
The detection of the acoustic events (AEs) that are naturally produced in a meeting room may help to describe the human and social activity that takes place in it. When applied to spontaneous recordings, AE detection from audio information alone produces a large number of errors, mostly due to temporal overlapping of sounds. In this paper, a system to detect and recognize AEs using both audio and video information is presented. A feature-level fusion strategy is used, and the HMM-GMM based system models each class separately, using a one-against-all strategy for training. Experimental acoustic event detection (AED) results on a new and rather spontaneous dataset show the advantage of the proposed approach. |
Subject(s):
|
-UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing -Speech processing systems -Digital video -acoustic event detection -multimodality -multimodal fusion -hidden Markov models -acoustic localization -Automatic speech recognition |
Rights:
|
|
Document type:
|
Article - Published version Conference Object |