Abstract:
|
This report focuses on classification and clustering methods
applied to a synthetic dataset in which we gradually increased
the proportion of missing data. At each increment we
independently tested several classification and clustering
algorithms and tools (C4.5, Roc, AutoClass, Kmeans and K2).
We report preliminary results suggesting that classification and
clustering methods based on Bayesian or probabilistic structures
work well in incomplete data domains. Intuitively, this suggests
that it is possible to build a kind of Bayesian framework whose
components share information efficiently.
These results are not completely satisfactory, and we need to
test different algorithms: not only classification or clustering
methods, but also others, such as CBRs.
This experiment was motivated by the question of how to formulate
a general description of the recommendation process. |