Abstract:
|
Microaggregation is a protection method used by statistical agencies to limit the disclosure risk of confidential information. Formally, microaggregation assigns each original datum to a small cluster and then
replaces the original data with the centroid of such cluster. As clusters contain at least k records, microaggregation can be considered as preserving k-anonymity. Nevertheless, this is only so when multivariate microaggregation is applied and, moreover, when all variables are microaggregated at the same time. When different variables are protected using univariate microaggregation, k-anonymity is only ensured at the variable level. Therefore, the real k-anonymity decreases for most of the records and it is then possible to cause a leakage of privacy. Due to this, the analysis of the disclosure risk is still meaningful in microaggregation.
This paper proposes a new record linkage method for univariate microaggregation
based on finding the optimal alignment between the original and the protected sorted variables. We show that our method, which uses a DTW distance to compute the optimal alignment, provides the intruder with enough information in many cases to to decide if the link is correct or not. Note that, standard record linkage methods never ensure the correctness of the linkage.
Furthermore, we present some experiments using two well-known data sets, which show that our method has better results (larger number of correct links) than the best standard record linkage method. |