To access the full-text documents, please follow this link: http://hdl.handle.net/2117/99750
Title: | Prosodic break prediction with RNNs |
---|---|
Author(s): | Pascual de la Puente, Santiago; Bonafonte Cávez, Antonio |
Other authors: | Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla |
Abstract: | Predicting prosodic breaks from text is a fundamental task for achieving naturalness in text-to-speech applications. In this work we build a data-driven break predictor from linguistic features such as Part of Speech (POS) tags and the forward and backward word distance to punctuation marks. To do so, we use a basic Recurrent Neural Network (RNN) model that exploits the sequential dependency between decisions. In the experiments we evaluate the performance of a logistic regression model and of the recurrent one. The results show that logistic regression outperforms the baseline (CART) by 9.5% in F-score, and that adding the recurrent layer to the model further improves on the baseline's predictions, by 11%. |
Note: | Peer Reviewed |
Subject(s): | Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic; Automatic speech recognition; Forecasting; Linguistics; Recurrent neural networks; Regression analysis; Linguistic features; Logistic Regression modeling; Logistic regressions; Part-of-speech tags; Prosodic breaks; Punctuation marks; Recurrent neural network (RNN); Text to speech; Reconeixement automàtic de la parla |
Rights: | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
Document type: | Article - Published version; Conference object |
Publisher: | Springer |
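
The abstract names the forward and backward word distance to punctuation marks as one of the input features, but this record does not say how it is computed. The following is a minimal sketch of one plausible reading, not the authors' implementation: for each word, count the words separating it from the previous and from the next punctuation mark, treating the start and end of the sequence as boundaries. The function name `punctuation_distances`, the `PUNCTUATION` set, and the tokenization are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed punctuation inventory; the paper's actual set is not given in this record.
PUNCTUATION = {".", ",", ";", ":", "?", "!"}


def punctuation_distances(tokens: List[str]) -> List[Tuple[str, int, int]]:
    """Return (word, backward, forward) for every non-punctuation token.

    backward: number of words between the word and the previous punctuation
              mark (or the start of the sequence).
    forward:  number of words between the word and the next punctuation
              mark (or the end of the sequence).
    """
    backward: List[int] = []
    count = 0
    for tok in tokens:  # left-to-right pass for the backward distance
        if tok in PUNCTUATION:
            count = 0
        else:
            backward.append(count)
            count += 1

    forward: List[int] = []
    count = 0
    for tok in reversed(tokens):  # right-to-left pass for the forward distance
        if tok in PUNCTUATION:
            count = 0
        else:
            forward.append(count)
            count += 1
    forward.reverse()

    words = [tok for tok in tokens if tok not in PUNCTUATION]
    return list(zip(words, backward, forward))


features = punctuation_distances(["After", "the", "storm", ",", "we", "left", "."])
for word, back, fwd in features:
    print(word, back, fwd)
```

In a full predictor, each word's pair of distances would be combined with its POS tag to form the per-word input vector for the logistic regression or recurrent model.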