Introducing nativization to Spanish TTS systems

Inicio | ¿Qué es? | Contacto

English | Català

Consultar RECERCAT

Por comunidades y
colecciones Por fecha Por autores Por títulos Por temas (CDU)

Consultar departamento

Por fecha Por autores Por títulos Por temas (CDU)

Estadisticas

Del documento Todo RECERCAT

Mi RECERCAT

Entrar Alertas por correo-e

Directorio de otros repositorios

RECERCAT Principal > Universitat Politècnica de Catalunya > Documents de recerca > Visualizar documento

Para acceder a los documentos con el texto completo, por favor, siga el siguiente enlace: http://hdl.handle.net/2117/13690

Título:	Introducing nativization to Spanish TTS systems
Autor/a:	Polyakova, Tatyana; Bonafonte Cávez, Antonio
Otros autores:	Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Abstract:	In the modern world, speech technologies must be flexible and adaptable to any framework. Mass media globalization introduces multilingualism as a challenge for the most popular speech applications such as text-to-speech synthesis and automatic speech recognition. Mixed-language texts vary in their nature and when processed, some essential characteristics must be considered. In Spain and other Spanish-speaking countries, the use of Anglicisms and other words of foreign origin is constantly growing. A particularity of peninsular Spanish is that there is a tendency to nativize the pronunciation of non-Spanish words so that they fit properly into Spanish phonetic patterns. In our previous work, we proposed to use hand-crafted nativization tables that were capable of nativizing correctly 24% of words from the test data. In this work, our goal was to approach the nativization challenge by data-driven methods, because they are transferable to other languages and do not drop in performance in comparison with explicit rules manually written by experts. Training and test corpora for nativization consisted of 1000 and 100 words respectively and were crafted manually. Different specifications of nativization by analogy and learning from errors focused on finding the best nativized pronunciation of foreign words. The best obtained objective nativization results showed an improvement from 24% to 64% in word accuracy in comparison to our previous work. Furthermore, a subjective evaluation of the synthesized speech allowed for the conclusion that nativization by analogy is clearly the preferred method among listeners of different backgrounds when comparing to previously proposed methods. These results were quite encouraging and proved that even a small training corpus is sufficient for achieving significant improvements in naturalness for English inclusions of variable length in Spanish utterances.
Abstract:	Peer Reviewed
Materia(s):	-Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic -Speech synthesis -Reconeixement automàtic de la parla
Derechos:
Tipo de documento:	Artículo - Versión publicada Artículo
Compartir:

Mostrar el registro completo del ítem

Documentos relacionados

Otros documentos del mismo autor/a

Defining analogy for non-native inclusions in Spanish utterances

Polyakova, Tatyana; Bonafonte Cávez, Antonio

Nativization of English words in Spanish using analogy

Polyakova, Tatyana; Bonafonte Cávez, Antonio

A bilingual Spanish-Catalan database of units for concatenative synthesis

Esquerra Llucià, Ignasi; Bonafonte Cávez, Antonio; Vallverdú Bayés, Sisco; Febrer Godayol, Albert

Corpus for cyberbullying prevention

Moreno Bilbao, M. Asunción; Bonafonte Cávez, Antonio; Jauk, Igor; Tarrés, Laia; Pereira, Victor

The demiphone: An efficient contextual subword unit for continuous speech recognition

Mariño Acebal, José Bernardo; Nogueiras Rodríguez, Albino; Pachés-Leal, Pau; Bonafonte Cávez, Antonio

Accesibilidad | Aviso legal | Política de Cookies | Documentos de uso interno

Coordinación

Patrocinio