Abstract:
|
The management of sequential and temporal data is an increasingly common need in many data mining processes. Developing new privacy-preserving data
mining techniques for sequential data is therefore crucial to ensure that sequence data analysis is performed without disclosing
sensitive information. Although data analysis and data protection are very different processes, they share a few common components, such as similarity measurement.
In this paper, we propose a new similarity function for categorical sequences of events based on OWA operators and fuzzy quantifiers. The main advantage of this new similarity function is that it can incorporate user preferences into the similarity computation. We describe how different user preference policies affect the similarity measurement when microaggregation, a well-known data protection method, is applied to sequential data. |