Discourse Marker Characterisation Via Clustering: Extrapolation from Supervised to Unsupervised Corpora

Alonso, Laura; Castellón Masalles, Irene; Padró, Lluís; Gibert, Karina

dc.contributor.author

Alonso, Laura

dc.contributor.author

Castellón Masalles, Irene

dc.contributor.author

Padró, Lluís

dc.contributor.author

Gibert, Karina

dc.date.issued

2019-03-11T15:10:09Z

dc.date.issued

2002

dc.date.issued

2019-03-11T15:10:10Z

dc.identifier

1135-5948

dc.identifier

https://hdl.handle.net/2445/130028

dc.identifier

514597

dc.description.abstract

In this paper we show how clustering techniques provide empirical evidence for a characterisation of Discourse Markers (DMs) that helps overcome the lack of consensus and reduces the cost of building NLP resources based on DMs. By comparing classifications obtained from hand-tagged and unsupervised corpora, we ground a notion of DM prototypicality, from which reliable classifications can be obtained from fully unsupervised corpora.

dc.format

8 p.

dc.format

application/pdf

dc.language

eng

dc.publisher

Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN)

dc.relation

Reproduction of the document published at: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/3257

dc.relation

Procesamiento del lenguaje natural, 2002, no. 29, pp. 223-230

dc.rights

info:eu-repo/semantics/openAccess

dc.source

Articles published in journals (Filologia Catalana i Lingüística General)

dc.subject

Tractament del llenguatge natural (Informàtica)

dc.subject

Marcadors del discurs

dc.subject

Natural language processing (Computer science)

dc.subject

Discourse markers

dc.title

Discourse Marker Characterisation Via Clustering: Extrapolation from Supervised to Unsupervised Corpora

dc.type

info:eu-repo/semantics/article

dc.type

info:eu-repo/semantics/publishedVersion
