Synthesis using speaker adaptation from speech recognition DB

Oller Moreno, Sergio; Moreno Bilbao, M. Asunción; Bonafonte Cávez, Antonio

Author

Oller Moreno, Sergio

Moreno Bilbao, M. Asunción

Bonafonte Cávez, Antonio

Other authors

Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions

Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla

Publication date

2010

Abstract

This paper deals with the creation of multiple voices from a Hidden Markov Model-based speech synthesis system (HTS). More than 150 Catalan synthetic voices were built using Hidden Markov Models (HMM) and speaker adaptation techniques. Training data for building a Speaker-Independent (SI) model were selected from both a general-purpose speech synthesis database (FestCat) and a database designed for training Automatic Speech Recognition (ASR) systems (the Catalan SpeeCon database). The SpeeCon database was also used to adapt the SI model to different speakers. Using a database designed for ASR for TTS purposes provided many different amateur voices, each with only a few minutes of recordings that were not made in studio conditions. This paper shows how speaker adaptation techniques provide the right tools to generate multiple voices with very little adaptation data. A subjective evaluation was carried out to assess the intelligibility and naturalness of the generated voices, as well as the similarity of the adapted voices to both the original speaker and the average voice from the SI model.

Peer Reviewed

Postprint (published version)

Document Type

Conference lecture

Language

English

Subjects and keywords

UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing; Automatic speech recognition; Speech synthesis; HMM; Adaptation

Publisher

Universidad de Vigo

Related items

http://fala2010.uvigo.es/images/proceedings/pdfs/0015.pdf

Rights

Open Access

This item appears in the following Collection(s)

E-prints [72986]
