To access the full text documents, please follow this link: http://hdl.handle.net/2117/9808

Simple semi-supervised dependency parsing
Koo, Terry; Carreras Pérez, Xavier; Collins, Michael
Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
We present a simple and effective semisupervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that the cluster-based features yield substantial gains in performance across a wide range of conditions. For example, in the case of English unlabeled second-order parsing, we improve from a baseline accuracy of 92.02% to 93.16%, and in the case of Czech unlabeled second-order parsing, we improve from a baseline accuracy of 86.13% to 87.13%. In addition, we demonstrate that our method also improves performance when small amounts of training data are available, and can roughly halve the amount of supervised data required to reach a desired level of performance.
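The idea of features that incorporate word clusters can be illustrated with a minimal sketch. It is not the authors' implementation: the cluster table, the cluster IDs, the feature templates, and the names `WORD_TO_CLUSTER`, `cluster_of`, and `arc_features` are all hypothetical, chosen only to show how a cluster ID might stand in for a word in a head-modifier feature.

```python
from typing import Dict, List

# Hypothetical toy table mapping words to cluster IDs. In the setting the
# abstract describes, clusters would be derived from a large unannotated
# corpus; the IDs below are invented for illustration.
WORD_TO_CLUSTER: Dict[str, str] = {
    "ate": "C17",
    "devoured": "C17",
    "pizza": "C4",
    "pasta": "C4",
}

UNKNOWN_CLUSTER = "C_UNK"  # fallback for words missing from the table


def cluster_of(word: str) -> str:
    """Return the cluster ID for a word (case-insensitive lookup)."""
    return WORD_TO_CLUSTER.get(word.lower(), UNKNOWN_CLUSTER)


def arc_features(head_word: str, head_pos: str,
                 mod_word: str, mod_pos: str) -> List[str]:
    """Build string features for one candidate head -> modifier arc."""
    # Baseline-style features over word forms and part-of-speech tags.
    baseline = [
        f"hw={head_word}|mw={mod_word}",
        f"hp={head_pos}|mp={mod_pos}",
        f"hw={head_word}|mp={mod_pos}",
        f"hp={head_pos}|mw={mod_word}",
    ]
    # Cluster-based features: the same templates with cluster IDs in place
    # of word forms, so rare words share statistics with similar words.
    hc, mc = cluster_of(head_word), cluster_of(mod_word)
    cluster_based = [
        f"hc={hc}|mc={mc}",
        f"hc={hc}|mp={mod_pos}",
        f"hp={head_pos}|mc={mc}",
    ]
    return baseline + cluster_based
```

In this sketch, the arcs "ate" -> "pizza" and "devoured" -> "pasta" have no word-pair feature in common, yet they share every cluster-based feature, which is the kind of generalization across lexical items that the abstract attributes to the cluster-based representation.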
Peer Reviewed
UPC subject areas::Computer science::Artificial intelligence::Natural language
Computational linguistics
info:eu-repo/semantics/submittedVersion
info:eu-repo/semantics/conferenceObject
         
