Abstract:
|
In many real-world problems that ultimately require data classification, not all the class labels are readily available. This is the setting of semi-supervised learning, in which missing class labels must be inferred from the available ones as well as from the natural cluster structure of the data. This structure can sometimes be quite convoluted. Previous research has shown the advantage, in such cases, of using the geodesic metric in clustering models of the manifold learning family to reveal the
underlying true data structure. In this brief paper, we present a novel semi-supervised approach, namely Semi-Supervised Geo-GTM (SS-Geo-GTM). This is an extension of Geo-GTM, a variation on the Generative Topographic Mapping (GTM) manifold learning model for data clustering
and visualization that relies on the geodesic metric. SS-Geo-GTM uses a proximity graph built from the Geo-GTM manifold as the basis for a label propagation algorithm that infers missing class labels. Its performance
is compared with that of a semi-supervised version of the standard GTM and of the alternative Laplacian Eigenmaps method. |