Abstract:
|
In this paper, we propose to discriminatively model target
and impostor spectral features using Deep Belief Networks
(DBNs) for speaker recognition. At the feature level, the number
of impostor samples is considerably larger than in
previous works based on i-vectors. Therefore, i-vector-based
impostor selection algorithms are not computationally
practical. Moreover, the number of samples for each
target speaker varies from one speaker to another, which
makes the training process more difficult. In this work, we
take advantage of DBN unsupervised learning to train a global
model, which we refer to as the Universal DBN (UDBN).
We then adapt this UDBN to the data of each target speaker.
The evaluation is performed on the core test condition of the
NIST SRE 2006 database, and the proposed architecture is shown
to achieve a relative improvement of more than 8% over
the conventional Multilayer Perceptron (MLP). |