Title:
|
A Richly annotated, multilingual parallel corpus for hybrid machine translation
|
Author:
|
Avramidis, Elefterios; Ruiz Costa-Jussà, Marta; Federmann, Christian; Melero, Maite; Pecina, Pavel; Van Genabith, Josef
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla |
Abstract:
|
In recent years, machine translation (MT) research has focused on investigating how hybrid machine translation as well as system combination approaches can be designed so that the resulting hybrid translations show an improvement over the individual “component” translations. As a first step towards achieving this objective we have developed a parallel corpus with source text and the corresponding translation output from a number of machine translation engines, annotated with metadata information, capturing aspects of the translation process performed by the different MT systems. This corpus aims to serve as a basic resource for further research on whether hybrid machine translation algorithms and system combination techniques can benefit from additional (linguistically motivated, decoding, and runtime) information provided by the different systems involved. In this paper, we describe the annotated corpus we have created. We provide an overview on the component MT systems and the XLIFF-based annotation format we have developed. We also report on first experiments with the ML4HMT corpus data. |
Subject(s):
|
-Àrees temàtiques de la UPC::Informàtica -Machine translation -MachineTranslation -SystemCombination -Annotated Corpus -Traducció automàtica |
Rights:
|
http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
Document type:
|
Article - Published version Conference Object |
Published by:
|
European Language Resources Association (ELRA)
|