Multi-source Data Certification

Glen Nuckolls (Joint work with Chip Martel and Stuart Stubblebine)

Ensuring the accuracy of content that users obtain from online providers is complicated by a number of factors. Even if the data provider is trusted, it may still be vulnerable to attacks. Unreliable communication channels introduce additional vulnerabilities.

The fact that data providers often integrate data from multiple sources further complicates the problem.

Currently available mechanisms can likely be combined to ensure data accuracy for users when data comes from multiple sources, but such solutions are unlikely to be efficient or to scale easily, and so are poorly suited to authenticating a large number of queries for a large number of users.

Authentic Publication provides an efficient and scalable way to ensure query integrity for users when the data originates from a single trusted source (the Owner) but is provided by an untrusted third-party Publisher.
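The single-Owner setting can be illustrated with a minimal sketch. It assumes a Merkle hash tree as the digest structure, which this abstract does not specify, and it omits the Owner's signature: the root digest is simply assumed to reach the user over an authenticated channel. All function names and sample records are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(records):
    """Owner side: return every tree level, leaf hashes first, root last."""
    level = [h(b"leaf:" + r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Pad odd levels by duplicating the last node (a sketch-level choice).
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(b"node:" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    """Publisher side: collect the sibling hashes on the path to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(root, record, proof):
    """User side: recompute the root from the record and the proof."""
    digest = h(b"leaf:" + record)
    for sibling, sibling_is_left in proof:
        pair = sibling + digest if sibling_is_left else digest + sibling
        digest = h(b"node:" + pair)
    return digest == root

# Hypothetical data set held by the Owner and handed to the Publisher.
records = [b"alice:42", b"bob:17", b"carol:99"]
levels = build_levels(records)
root = levels[-1][0]          # the only value the user must trust

proof = prove(levels, 1)      # Publisher answers a query for record 1
print(verify(root, b"bob:17", proof))   # genuine record
print(verify(root, b"bob:18", proof))   # tampered record
```

The user trusts only the root digest, so the Publisher cannot alter a record without the recomputed root failing to match, and the proof grows with the logarithm of the number of records rather than with the size of the data set.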

This talk describes mechanisms that extend this type of efficient authentication to a setting in which an untrusted Publisher collects data from multiple sources. Owners can check that their data is accurately represented in answers to users' queries without handling the entire combined data set, and users get guarantees that answers accurately reflect the Owners' data. The mechanisms described support membership queries as well as multi-attribute queries.

Applications of this work are numerous, including medical, scientific, government, and financial databases.

Gates 4B (opposite 490), 02/25/03, 4:30 PM