Synthesis using speaker adaptation from speech recognition DB

Inicio | ¿Qué es? | Contacto

English | Català

Consultar RECERCAT

Por comunidades y
colecciones Por fecha Por autores Por títulos Por temas (CDU)

Consultar departamento

Por fecha Por autores Por títulos Por temas (CDU)

Estadisticas

Del documento Todo RECERCAT

Mi RECERCAT

Entrar Alertas por correo-e

Directorio de otros repositorios

RECERCAT Principal > Universitat Politècnica de Catalunya > Documents de recerca > Visualizar documento

Para acceder a los documentos con el texto completo, por favor, siga el siguiente enlace: http://hdl.handle.net/2117/124661

Título:	Synthesis using speaker adaptation from speech recognition DB
Autor/a:	Oller Moreno, Sergio; Moreno Bilbao, M. Asunción; Bonafonte Cávez, Antonio
Otros autores:	Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Abstract:	This paper deals with the creation of multiple voices from a Hidden Markov Model based speech synthesis system (HTS). More than 150 Catalan synthetic voices were built using Hidden Markov Models (HMM) and speaker adaptation techniques. Training data for building a Speaker-Independent (SI) model were selected from both a general purpose speech synthesis database (FestCat;) and a database design ed for training Automatic Speech Recognition (ASR) systems (Catalan SpeeCon database). The SpeeCon database was also used to adapt the SI model to different speakers. Using an ASR designed database for TTS purposes provided many different amateur voices, with few minutes of recordings not performed in studio conditions. This paper shows how speaker adaptation techniques provide the right tools to generate multiple voices with very few adaptation data. A subjective evaluation was carried out to assess the intelligibility and naturalness of the generated voices as well as the similarity of the adapted voices to both the original speaker and the average voice from the SI model.
Abstract:	Peer Reviewed
Materia(s):	-Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic -Automatic speech recognition -Speech synthesis -HMM -Adaptation -Reconeixement automàtic de la parla
Derechos:
Tipo de documento:	Artículo - Versión publicada Objeto de conferencia
Editor:	Universidad de Vigo
Compartir:

Mostrar el registro completo del ítem

Documentos relacionados

Otros documentos del mismo autor/a

Recent work on the FESTCAT database for speech synthesis

Bonafonte Cávez, Antonio; Esquerra Llucià, Ignasi; Aguilar, Lourdes; Oller Moreno, Sergio; Moreno Bilbao, M. Asunción

Corpus for cyberbullying prevention

Moreno Bilbao, M. Asunción; Bonafonte Cávez, Antonio; Jauk, Igor; Tarrés, Laia; Pereira, Victor

Building synthetic voices in the METANET framework

Garcia Casademont, Emília; Bonafonte Cávez, Antonio; Moreno Bilbao, M. Asunción

BUCEADOR, a multi-language search engine for digital libraries

Adell Mercado, Jordi; Bonafonte Cávez, Antonio; Cardenal, Antonio; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asunción; Navas, Eva; Rodríguez Banga, Eduardo

Speech emotion recognition using hidden Markov models

Nogueiras Rodríguez, Albino; Mariño Acebal, José Bernardo; Bonafonte Cávez, Antonio; Moreno Bilbao, M. Asunción

Accesibilidad | Aviso legal | Política de Cookies | Documentos de uso interno

Coordinación

Patrocinio