Title:
|
N-gram-based statistical machine translation versus syntax augmented machine translation: comparison and system combination
|
Author(s):
|
Khalilov, Maxim; Rodríguez Fonollosa, José Adrián
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla |
Abstract:
|
In this paper we compare and contrast two approaches to Machine Translation (MT): the CMU-UKA Syntax Augmented Machine Translation system (SAMT) and UPC-TALP N-gram-based Statistical Machine Translation (SMT). SAMT is a hierarchical syntax-driven translation system underlain by a phrase-based model and a target part parse tree. In N-gram-based SMT, the translation process is based on bilingual units related to word-to-word alignment and statistical modeling of the bilingual context following a maximum-entropy framework. We provide a step-by-step comparison of the systems and report results in terms of automatic evaluation metrics and required computational resources for a smaller Arabic-to-English translation task (1.5M tokens in the training corpus). Human error analysis clarifies advantages and disadvantages of the systems under consideration. Finally, we combine the output of both systems to yield significant improvements in translation quality. |
Subject(s):
|
- UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing
- UPC subject areas::Telecommunications engineering::Signal processing
- Natural language processing
- Signal processing
- Natural language (Computer science) -- Processing
- Machine translation |
Rights:
|
|
Document type:
|
Article - Published version; Conference object |