Título:
|
Spoken document retrieval based on approximated sequence alignment
|
Autor/a:
|
Comas Umbert, Pere Ramon; Turmo Borras, Jorge
|
Otros autores:
|
Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics; Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural |
Abstract:
|
This paper presents a new approach to spoken document information retrieval for spontaneous speech corpora. The classical approach to this problem is the use of an automatic speech recognizer (ASR) combined with standard information retrieval techniques. However, ASRs tend to produce transcripts of spontaneous speech with significant word error rate, which is a drawback for standard retrieval techniques. To overcome such a limitation, our method is based on an approximated sequence alignment algorithm to search “sounds like” sequences. Our approach does not depend on extra information from the ASR and outperforms up to 7 points the precision of state-of-the-art techniques in our experiments. |
Abstract:
|
Peer Reviewed |
Materia(s):
|
-Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural -Oral communication -- Data processing -Comunicació oral -- Informàtica |
Derechos:
|
|
Tipo de documento:
|
Artículo - Versión presentada Objeto de conferencia |
Compartir:
|
|