Abstract:
|
This paper describes the integral Knowledge Discovery (KDD) process, including both prior expert knowledge and interpretation-oriented tools, used to extract the behavior of a real pilot wastewater treatment plant. Special emphasis is placed on developing postprocessing tools for clustering methods that help the expert understand the meaning of the clusters and bridge the important existing gap between Data Mining and effective Decision Support. The Traffic Lights Panel (TLP) is presented as a suitable visual, interpretation-oriented tool for clustering results. Based on this tool, four typical behaviors are identified in the pilot plant, and these have been validated by the experts. Until now, the TLP has been derived manually from the clustering results, but it has been well accepted by the domain experts of several real applications as a very helpful contribution to understanding the meaning of the classes and improving reliable decision-making. Here, a proposal for the automatic construction of TLPs is presented that tries to mimic the real process the analyst performs to build them manually. A criterion based on the conditional median, as a central tendency statistic of the variables inside a class, is introduced and then refined to gain robustness against outliers. Both criteria are tested and compared using the real target case study. A deep analysis of the advantages and drawbacks of the proposed criterion made it possible to better understand the analyst's process when manually building TLPs, to identify the scope of the proposal, and to typify some of the situations in which additional conditions are required. |