Abstract:
|
In real applications, high rates of missing data are common, and the missing values have to be handled before analysis. The literature on missing-data imputation is abundant. However, the most precise imputation methods are time-consuming and sometimes require specific software, which significantly delays the final results. The Mixed Intelligent-Multivariate Missing Imputation (MIMMI) method is proposed as a hybrid missing-data imputation methodology based on clustering. MIMMI is a nonparametric method that combines prior expert knowledge
with multivariate analysis without requiring assumptions about the probability models of the variables (normality, exponentiality, etc.). The imputed values implicitly take into account the joint distribution of all variables and can be computed in a relatively short time. MIMMI uses the conditional mean according to the dataset's own underlying structure. It provides a good trade-off between accuracy on the one hand and simplicity and data-preparation time on the other. The mechanics of the method are illustrated with case studies covering both synthetic data and real applications related to human behavior. In both cases, results of acceptable quality were obtained in a short time. |