A Conceptual Probabilistic Framework for Annotation Aggregation of Citizen Science Data

Publication date

2021-04-22T10:04:08Z

2021-04-15

Abstract

Over the last decade, hundreds of thousands of volunteers have contributed to science by collecting or analyzing data. This public participation in science, also known as citizen science, has contributed to significant discoveries and led to publications in major scientific journals. However, little attention has been paid to data quality issues. In this work we argue that being able to determine the accuracy of data obtained by crowdsourcing is a fundamental question and we point out that, for many real-life scenarios, mathematical tools and processes for the evaluation of data quality are missing. We propose a probabilistic methodology for the evaluation of the accuracy of labeling data obtained by crowdsourcing in citizen science. The methodology builds on an abstract probabilistic graphical model formalism, which is shown to generalize some already existing label aggregation models. We show how to make practical use of the methodology through a comparison of data obtained from different citizen science communities analyzing the earthquake that took place in Albania in 2019.

Document Type

Article

Published version

Language

English

Publisher

MDPI

Related items

Reproduction of the document published at: https://doi.org/10.3390/math9080875

Mathematics, 2021, vol. 9, p. 875

https://doi.org/10.3390/math9080875

info:eu-repo/grantAgreement/EC/H2020/761758/EU//X5gon

info:eu-repo/grantAgreement/EC/H2020/952026/EU//HumanE-AI-Net

info:eu-repo/grantAgreement/EC/H2020/872944/EU//CROWD4SDG

Rights

cc-by (c) Cerquides Bueno, Jesús et al., 2021

http://creativecommons.org/licenses/by/3.0/es
