Privacy through Accountability: The Case of Web Advertising

Anupam Datta


With the rapid increase in Web services collecting and using user data to offer personalized experiences, ensuring that these services comply with their privacy policies has become a business imperative for building user trust. In this talk, I will report on two of our recent results that Web services companies can employ to improve their privacy compliance efforts and be accountable for their privacy promises.

First, I will present our joint work with Microsoft Research on building and operating a system to automate privacy policy compliance checking in Bing. Central to the design of the system are (a) LEGALEASE -- a language that allows specification of privacy policies that impose restrictions on how user data is handled; and (b) GROK -- a data inventory for Map-Reduce-like big data systems that tracks how user data flows among programs. GROK maps code-level schema elements to datatypes in LEGALEASE, in essence annotating existing programs with information flow types with minimal human input. Compliance checking is thus reduced to information flow analysis of big data systems. The system, bootstrapped by a small team, runs daily compliance checks over millions of lines of ever-changing source code, written by several thousand developers, in the data analytics pipeline for Bing.
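
To make the reduction from compliance checking to information flow analysis concrete, here is a minimal sketch in Python. It is not LEGALEASE syntax and not GROK's implementation; the job names, datatype labels, purposes, and the single DENY rule are all hypothetical, chosen only to illustrate the idea of propagating datatype labels along data flows and then checking them against a policy.

```python
from collections import deque

# Hypothetical data-flow graph: each key feeds its output to the listed jobs.
FLOWS = {
    "raw_query_log": ["sessionize"],
    "sessionize": ["ads_ranking_features", "abuse_detection"],
    "ads_ranking_features": [],
    "abuse_detection": [],
}

# Hypothetical datatype labels on source data (what a data inventory might supply).
SOURCE_LABELS = {"raw_query_log": {"IPAddress", "SearchQuery"}}

# Hypothetical purpose attached to each job.
PURPOSE = {
    "sessionize": "Shared",
    "ads_ranking_features": "Advertising",
    "abuse_detection": "AbuseDetection",
}

# Hypothetical policy: (datatype, purpose) pairs that are not allowed.
DENY = {("IPAddress", "Advertising")}


def propagate(flows, source_labels):
    """Push datatype labels forward along the graph until nothing changes."""
    labels = {node: set() for node in flows}
    for node, types in source_labels.items():
        labels[node] |= types
    queue = deque(source_labels)
    while queue:
        node = queue.popleft()
        for succ in flows[node]:
            before = len(labels[succ])
            labels[succ] |= labels[node]
            if len(labels[succ]) > before:  # new label arrived, revisit successors
                queue.append(succ)
    return labels


def violations(flows, source_labels, purpose, deny):
    """Return (job, datatype) pairs where a denied datatype reaches a job's purpose."""
    labels = propagate(flows, source_labels)
    found = []
    for node in sorted(labels):
        for datatype in sorted(labels[node]):
            if (datatype, purpose.get(node)) in deny:
                found.append((node, datatype))
    return found


if __name__ == "__main__":
    print(violations(FLOWS, SOURCE_LABELS, PURPOSE, DENY))
```

In this toy graph the check flags the advertising job, because the IPAddress label introduced at the raw log flows through the intermediate job and reaches it. A production system would have to handle far finer-grained labels and flows than this sketch does.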

Second, I will describe the problem of detecting personal data usage by websites when the analyst has neither access to the code of the system nor full control over its inputs or full observability of its outputs. A concrete example of this setting is one in which a privacy advocacy group, a government regulator, or a Web user may be interested in checking whether a particular website uses certain types of personal information for advertising. I will present a methodology for information flow experiments based on experimental science and statistical analysis that addresses this problem, our tool AdFisher that incorporates this methodology, and findings of opacity, choice, and discrimination from our experiments with Google.
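
As an illustration of what the statistical side of such a black-box experiment can look like, the sketch below runs a permutation test on two groups of simulated browser agents. This is an assumption made for illustration, not a description of AdFisher's code: the choice of test statistic (difference in group means), the function name, and the ad counts are all hypothetical. The idea is that if the attribute under study had no effect on the ads shown, reassigning group membership at random should produce differences as large as the observed one fairly often.

```python
import random


def permutation_test(group_a, group_b, trials=2000, seed=0):
    """Estimate a p-value for the difference in means between two groups.

    group_a, group_b: per-agent measurements, e.g. how many ads of some
    category each browser agent was shown (hypothetical data).
    """
    rng = random.Random(seed)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # random relabeling of agents
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)


if __name__ == "__main__":
    # Made-up counts of a given ad category seen by each agent.
    treated = [10, 9, 11, 10, 12, 9, 10, 11]  # agents with the attribute set
    control = [1, 0, 2, 1, 0, 1, 2, 0]        # agents without it
    print(permutation_test(treated, control))
```

A small p-value suggests that the difference between the groups is unlikely to be due to chance in how agents were assigned, which is the kind of evidence an outside analyst can gather without seeing the system's code.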


Anupam Datta is an Associate Professor of Computer Science and Electrical & Computer Engineering at Carnegie Mellon University, and a former occupant of Gates 468.

Time and Place

Thursday, August 28, 4:15pm
Gates 463A