To access the full-text documents, please follow this link: http://hdl.handle.net/2117/23411

TweetNorm_es: an annotated corpus for Spanish microtext normalization
Alegria, Iñaki; Aranberri, Nora; Comas Umbert, Pere Ramon; Fresno, Víctor; Gamallo, Pablo; Padró, Lluís; San Vicente Roncal, Iñaki; Turmo Borras, Jorge; Zubiaga, Arkaitz
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics; Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
In this paper we introduce TweetNorm_es, an annotated corpus of Spanish tweets, which we make publicly available under the terms of the CC-BY license. The corpus is intended for the development and testing of microtext normalization systems. It was created for Tweet-Norm, a tweet normalization workshop and shared task, and is the result of a joint annotation effort by different research groups. We describe the methodology defined to build the corpus as well as the guidelines followed in the annotation process. We also present a brief overview of the Tweet-Norm shared task, the first evaluation environment in which the corpus was used.
Peer Reviewed
UPC subject areas::Computer science::Artificial intelligence::Natural language
Spanish language -- 21st century
Microtext normalization
Twitter
phonology
Spanish language -- Phonology
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/conferenceObject
European Language Resources Association (ELRA)
         


Related documents

Other documents by the same author

Padró, Lluís; Turmo Borras, Jorge; Alegria, Iñaki; Aranberri, Nora; Fresno, Víctor; Gamallo, Pablo; San Vicente, Iñaki; Zubiaga, Arkaitz
Ageno Pulido, Alicia; Comas Umbert, Pere Ramon; Padró, Lluís; Turmo Borras, Jorge
González Pellicer, Edgar; Rodríguez Hontoria, Horacio; Turmo Borras, Jorge; Comas Umbert, Pere Ramon; Mehdizadeh Naderi, Ali; Ageno Pulido, Alicia; Sapena Masip, Emilio; Vila Rigat, Marta; Martí, Maria Antònia
Comas Umbert, Pere Ramon; Turmo Borras, Jorge
Ageno Pulido, Alicia; Comas Umbert, Pere Ramon; Mehdizadeh Naderi, Ali; Rodríguez Hontoria, Horacio; Turmo Borras, Jorge