Randomization is the process of perturbing data points to provide security through information-theoretic uncertainty. It is appropriate in applications where individual data points are to be protected, but statistical information is to be revealed.
We propose that the binary randomization protocol be viewed as a channel for transmitting the information that is to be kept secret. We show that a simple application of the basic theorems of Shannon provides lower bounds on the complexity of successful attacks. We use these bounds to define the privacy provided by a protocol: the (tight) lower bound on the number of queries required to successfully estimate a single attribute. Through the use of Shannon's channel coding theorem, the privacy of randomization is the inverse of the protocol channel capacity.
Gates 4B (opposite 490), 10/22/02, 4:30 PM