Abstract:
|
General finite mixture models are powerful tools for the density-based grouping of multivariate i.i.d. data, but they lack data visualization capabilities, which reduces their practical applicability to real-world problems. Generative topographic mapping (GTM) was originally formulated as a constrained mixture of distributions in order to provide simultaneous visualization and clustering of multivariate data. In its original formulation, the adaptive parameters were determined by maximum likelihood (ML), using the expectation-maximization (EM) algorithm. The original GTM is, therefore, prone to overfitting unless a regularization mechanism is included. In this paper, we define an alternative variational formulation of GTM that provides a full Bayesian treatment of a Gaussian process (GP)-based variant of the model. The generalization capabilities of the proposed Variational Bayesian GTM are assessed in some detail and compared with those of alternative GTM regularization approaches in terms of test log-likelihood, using several artificial and real datasets. |