Full text  Click to download. 
Citation  Journal of Privacy Technology

Authors  Gagan Aggarwal
Tomas Feder Krishnaram Kenthapadi Rajeev Motwani Rina Panigrahy Dilys Thomas An Zhu 
We consider the problem of releasing a table containing personal records, while ensuring individual privacy and maintaining data integrity to the extent possible. One of the techniques proposed in the literature is kanonymization. A release is considered kanonymous if the information corresponding to any individual in the release cannot be distinguished from that of at least k1 other individuals whose information also appears in the release. In order to achieve kanonymization, some of the entries of the table are either suppressed or generalized (e.g. an Age value of 23 could be changed to the Age range 2025). The goal is to lose as little information as possible while ensuring that the release is kanonymous. This optimization problem is referred to as the kAnonymity problem. We show that the kAnonymity problem is NPhard even when the attribute values are ternary and we are allowed only to suppress entries. On the positive side, we provide an O(k)approximation algorithm for the problem. We also give improved positive results for the interesting cases with specific values of k  in particular, we give a 1.5approximation algorithm for the special case of 2Anonymity, and a 2approximation algorithm for 3Anonymity.