Full text | Click to download. |
Citation | Proceedings of the Tenth International Conference on Database Theory (ICDT) 2005, Springer LNCS vol. 3363, pp. 246-258.
|
Authors | Gagan Aggarwal
Tomas Feder Krishnaram Kenthapadi Rajeev Motwani Rina Panigrahy Dilys Thomas An Zhu |
We consider the problem of releasing tables containing personal records while ensuring individual privacy and data integrity. One of the techniques proposed in the literature is k-anonymization. A release is considered k-anonymous if the information for each person contained in the release cannot be distinguished from at least k-1 other persons whose information also appears in the release. We show that the problem of k-anonymization is NP-hard even when the attribute values are ternary. On the positive side, we give an O(k)-approximation algorithm for the problem. This improves upon the previous best known O(k log k)-approximation [MW04]. In addition, we give a 1.5-approximation algorithm for the special case of 2-anonymity, and a 2-approximation algorithm for 3-anonymity.