Title:
|
Robust clustering of data collected via crowdsourcing
|
Author:
|
Pagès Zamora, Alba Maria; Giannakis, Georgios B.; López Valcarce, Roberto; Giménez Febrer, Pedro Juan
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. SPCOM - Grup de Recerca de Processament del Senyal i Comunicacions |
Abstract:
|
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Abstract:
|
Crowdsourcing approaches rely on the collection of multiple individuals to solve problems that require analysis of large data sets in a timely and accurate manner. The inexperience of participants or annotators motivates robust techniques. Focusing on clustering setups, the data provided by all annotators is suitably modeled here as a mixture of Gaussian components plus a uniformly distributed random variable to capture outliers. The proposed algorithm is based on the expectation-maximization algorithm and jointly allows for soft assignments of data to clusters, rating of annotators according to their performance, and estimation of the number of Gaussian components in the non-Gaussian/Gaussian mixture model. |
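The mixture model named in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes one-dimensional data, a uniform outlier density spread over the observed data range, and a known number of Gaussian components `k`, and it leaves out annotator rating and model-order estimation. The function name `em_gauss_plus_uniform` and all of its parameters are hypothetical.

```python
import numpy as np

def em_gauss_plus_uniform(x, k, n_iter=200):
    """EM for a 1-D mixture of k Gaussians plus one uniform outlier component.

    Illustrative sketch only; the uniform support is assumed to be the
    observed data range, which requires x.max() > x.min().
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    u_density = 1.0 / (x.max() - x.min())      # constant outlier density (assumption)
    # Deterministic initialization: means at evenly spaced data quantiles.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k + 1, 1.0 / (k + 1))    # last entry is the uniform component
    eps = 1e-12
    for _ in range(n_iter):
        # E-step: soft assignment of each point to every component.
        diff = x[:, None] - means[None, :]
        gauss = np.exp(-0.5 * diff**2 / variances) / np.sqrt(2 * np.pi * variances)
        lik = np.hstack([weights[:k] * gauss,
                         np.full((n, 1), weights[k] * u_density)])
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights, means, and variances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp[:, :k] * x[:, None]).sum(axis=0) / (nk[:k] + eps)
        diff = x[:, None] - means[None, :]
        variances = (resp[:, :k] * diff**2).sum(axis=0) / (nk[:k] + eps) + 1e-6
    return weights, means, variances, resp
```

The last column of `resp` holds each point's soft assignment to the outlier component, so points with a large value there are the ones the uniform term absorbs instead of distorting the Gaussian clusters.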
Abstract:
|
Peer Reviewed |
Subject(s):
|
-UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing -Automatic speech recognition -Crowdsourcing -Gaussian plus non-Gaussian mixture -Outlier -EM algorithm -Bayesian information criterion |
Document type:
|
Article - Submitted version Conference Object |