Title:
|
Unsupervised spectral learning of FSTs
|
Author:
|
Bailly, Raphaël; Carreras Pérez, Xavier; Quattoni, Ariadna Julieta
|
Other authors:
|
Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural; Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge |
Abstract:
|
Finite-state transducers (FSTs) are a standard tool for modeling paired input-output sequences and are used in numerous applications, ranging from computational biology to natural language processing. Recently, Balle et al. [4] presented a spectral algorithm for learning FSTs from samples of aligned input-output sequences. In this paper we address the more realistic, yet more challenging, setting in which the alignments are unknown to the learning algorithm. We frame FST learning as finding a low-rank Hankel matrix that satisfies constraints derived from observable statistics. Under this formulation, we provide identifiability results for FST distributions. Then, following previous work on rank minimization, we propose a regularized convex relaxation of this objective that minimizes a nuclear norm penalty subject to linear constraints and can be solved efficiently. |
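The two ingredients named in the abstract, a low-rank Hankel matrix and a nuclear norm penalty, can be illustrated with a small sketch. This is not the paper's algorithm: it uses a hypothetical one-symbol weighted automaton (the names `hankel_from_wfa`, `svt`, `alpha`, `A`, `beta` are assumptions made for illustration) to show that the Hankel matrix of a function computed by an n-state automaton has rank at most n, and that singular value soft-thresholding, the proximal step associated with the nuclear norm, recovers a low-rank matrix from a noisy one.

```python
import numpy as np

def hankel_from_wfa(alpha, A, beta, max_len):
    """Hankel block H[i, j] = f(x^(i+j)) for a toy one-symbol weighted
    automaton with f(x^k) = alpha^T A^k beta (illustrative assumption)."""
    powers = [np.linalg.matrix_power(A, k) for k in range(2 * max_len + 1)]
    return np.array([[alpha @ powers[i + j] @ beta
                      for j in range(max_len + 1)]
                     for i in range(max_len + 1)])

def svt(M, tau):
    """Singular value soft-thresholding: the proximal operator of
    tau * (nuclear norm), a standard convex surrogate for rank."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
n_states = 2
alpha = rng.random(n_states)
A = 0.4 * rng.random((n_states, n_states))
beta = rng.random(n_states)

# Exact Hankel block: factors through the n_states-dimensional state space,
# so its rank cannot exceed n_states.
H = hankel_from_wfa(alpha, A, beta, max_len=5)
rank_H = np.linalg.matrix_rank(H, tol=1e-10)

# Perturb the entries (standing in for estimation error), then shrink the
# singular values to push the matrix back toward low rank.
noisy = H + 1e-3 * rng.standard_normal(H.shape)
denoised = svt(noisy, tau=1e-2)
rank_denoised = np.linalg.matrix_rank(denoised, tol=1e-6)

print("rank of exact Hankel block:", rank_H)
print("rank after thresholding:", rank_denoised)
```

In the setting the abstract describes, the unknown Hankel entries would additionally be tied to observable statistics through linear constraints; handling those requires a constrained convex solver and is beyond this sketch.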
Subject(s):
|
-UPC subject areas::Computer science::Artificial intelligence::Natural language -Learning algorithms -Bioinformatics -Matrix algebra -Natural language processing systems -Relaxation processes -Computational biology -Finite state transducers -Linear constraints -Low-rank Hankel matrixes -Natural language processing -Rank minimizations -Spectral algorithm -Spectral learning |
Document type:
|
Article - Published version; Conference Object |