Title:
|
The risks of mixing dependency lengths from sequences of different length
|
Author:
|
Ferrer Cancho, Ramon; Liu, Haitao
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació; Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge |
Abstract:
|
Mixing dependency lengths from sequences of different length is a common practice in language research. However, the empirical distribution of dependency lengths of sentences of the same length differs from that of sentences of varying length. The distribution of dependency lengths depends on sentence length for real sentences and also under the null hypothesis that dependencies connect vertices located in random positions of the sequence. This suggests that certain results, such as the distribution of syntactic dependency lengths mixing dependencies from sentences of varying length, could be a mere consequence of that mixing. Furthermore, differences in the global averages of dependency length (mixing lengths from sentences of varying length) for two different languages do not simply imply a priori that one language optimizes dependency lengths better than the other because those differences could be due to differences in the distribution of sentence lengths and other factors. |
Abstract:
|
Peer Reviewed |
Subject(s):
|
-Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural -Computational linguistics -Syntactic dependency -Syntax -Dependency length -Lingüística computacional |
Rights:
|
|
Document type:
|
Article - Published version Article |
Share:
|
|