A proposal for Wide-Coverage Spanish Named Entity Recognition

dc.contributor.author
Arévalo, Montse
dc.contributor.author
Carreras, Xavier
dc.contributor.author
Martí Antonin, M. Antònia
dc.contributor.author
Màrquez, Lluís
dc.contributor.author
Padró, Lluís
dc.contributor.author
Simón, María José
dc.date.issued
2019-03-12T14:02:32Z
dc.date.issued
2002
dc.identifier
1135-5948
dc.identifier
https://hdl.handle.net/2445/130105
dc.identifier
508180
dc.description.abstract
This paper presents a proposal for wide-coverage Named Entity Recognition for Spanish. First, a linguistic description of the typology of Named Entities is proposed. Following this definition, an architecture of sequential processes is described for addressing the recognition and classification of strong and weak Named Entities. The former are treated using Machine Learning techniques (AdaBoost) and simple attributes requiring untagged corpora, complemented with external information sources (a list of trigger words and a gazetteer). The latter are approached through a context-free grammar for recognizing syntactic patterns. A thorough evaluation of the first task on real corpora is presented to validate the appropriateness of the approach. A preliminary version of the context-free grammar is qualitatively evaluated on a small hand-tagged corpus, also with good results.
dc.format
18 p.
dc.format
application/pdf
dc.language
eng
dc.publisher
Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN)
dc.relation
Reproduction of the document published at: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/3305
dc.relation
Procesamiento del lenguaje natural, 2002, no. 28, p. 63-80
dc.rights
(c) Arévalo, Montse et al., 2002
dc.rights
info:eu-repo/semantics/openAccess
dc.source
Articles published in journals (Filologia Catalana i Lingüística General)
dc.subject
Tractament del llenguatge natural (Informàtica)
dc.subject
Natural language processing (Computer science)
dc.title
A proposal for Wide-Coverage Spanish Named Entity Recognition
dc.type
info:eu-repo/semantics/article
dc.type
info:eu-repo/semantics/publishedVersion