Privacy Preserving Indexing of Documents on the Network

Citation: Proc. of Very Large Databases (VLDB), 2003, pp. 922-933
Authors: Mayank Bawa, Roberto Bayardo Jr., Rakesh Agrawal


We address the problem of providing privacy-preserving search over distributed access-controlled content. Indexed documents can be easily reconstructed from the conventional (inverted) indexes used in search. Avoiding breaches of access control through the index therefore requires the index hosting site to be fully secured and trusted by all participating content providers. This level of trust is impractical in the increasingly common case where multiple competing organizations or individuals wish to selectively share content. We propose a solution that eliminates the need for such a trusted authority. The solution builds a centralized privacy-preserving index in conjunction with a distributed access-control-enforcing search protocol. The new index provides strong and quantifiable privacy guarantees that hold even if the entire index is made public. Experiments on a real-life dataset validate the performance of the scheme. The appeal of our solution is two-fold: (a) content providers maintain complete control in defining access groups and ensuring compliance with them, and (b) system implementors retain tunable knobs to balance privacy and efficiency concerns for their particular domains.
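
The claim that documents can be reconstructed from a conventional inverted index can be illustrated with a minimal sketch. It is not the scheme proposed in the paper; the helper names `build_inverted_index` and `recover_terms` and the two sample documents are hypothetical, chosen only to show that anyone holding the index can recover the vocabulary of a document they were never granted access to.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def recover_terms(index, doc_id):
    """Recover the vocabulary of one document using only the index."""
    return sorted(term for term, ids in index.items() if doc_id in ids)


# Hypothetical sample content: document "a" is meant to be access-controlled.
docs = {
    "a": "merger plan draft",
    "b": "public press release",
}

index = build_inverted_index(docs)

# The index host never reads document "a" directly here,
# yet scanning the posting lists reveals every term it contains.
leaked = recover_terms(index, "a")
print(leaked)
```

This sketch recovers only the set of terms. An index that also stores term positions would additionally expose word order, which is why the hosting site of a conventional index has to be trusted by every content provider.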
