Abstract:
|
A new method for gene expression classification is
proposed in this paper. In a first step, the original feature set is
enriched by including new features, called metagenes, produced
via hierarchical clustering. In a second step, a reliable classifier
is built from a wrapper feature selection process. The selection
relies on two criteria: the classical classification error rate and
a new reliability measure. The result is a classifier with good
predictive ability that uses as few features as possible, which
reduces the risk of overfitting. This method has been tested on
three public cancer datasets: leukemia, lymphoma and colon. The
proposed method obtained interesting classification results,
and the experiments confirmed the utility of both the metagenes
and the feature ranking criterion in improving the final classifier. |
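
The first step described in the abstract, enriching the gene set with metagenes from hierarchical clustering, might be sketched as follows. This is only an illustration under stated assumptions, not the paper's actual procedure: the similarity measure (Pearson correlation), the merge rule (plain average of the two merged profiles), and the function names `pearson`, `build_metagenes` and `enrich` are all hypothetical choices made for this example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two expression profiles (assumed similarity)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / sqrt(var_a * var_b)


def build_metagenes(genes):
    """Agglomerative clustering over gene profiles.

    genes: list of profiles, one list of floats per gene (one value per sample).
    Each merge yields one metagene (assumption: the average of the two merged
    profiles), so n genes produce n - 1 metagenes.
    """
    active = [list(g) for g in genes]
    metagenes = []
    while len(active) > 1:
        # Find the most correlated pair among the current cluster profiles.
        best = None
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                r = pearson(active[i], active[j])
                if best is None or r > best[0]:
                    best = (r, i, j)
        _, i, j = best
        merged = [(x + y) / 2 for x, y in zip(active[i], active[j])]
        metagenes.append(merged)
        # Replace the merged pair with the new cluster profile.
        active = [p for k, p in enumerate(active) if k not in (i, j)] + [merged]
    return metagenes


def enrich(genes):
    """Return the original gene profiles followed by the generated metagenes."""
    return [list(g) for g in genes] + build_metagenes(genes)


if __name__ == "__main__":
    toy = [[1, 2, 3], [2, 4, 6], [3, 1, 2]]  # 3 genes measured on 3 samples
    for profile in enrich(toy):
        print(profile)
```

The enriched list would then be handed to the wrapper feature selection of the second step, which ranks candidate features by classification error rate and the reliability measure; that step is not sketched here because the abstract does not define the reliability measure.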