Privacy-Preserving Bayesian Network Structure Computation on
Distributed Heterogeneous Data
Rebecca Wright, Stevens Institute of Technology
As more and more activities are carried out using computers and
computer networks, the amount of potentially sensitive data stored by
businesses, governments, and other parties increases. Different parties
may wish to benefit from cooperative use of their data, but privacy
regulations and other privacy concerns may prevent the parties from
sharing their data. Privacy-preserving data mining provides a
solution by creating distributed data mining algorithms in which the
underlying data is not revealed.
We present a privacy-preserving protocol for a particular data mining
task: learning the Bayesian network structure for distributed
heterogeneous data. In this setting, two parties owning confidential
databases wish to learn the structure of a Bayesian network on the
combination of their databases without revealing anything about their
data to each other. We give an efficient and privacy-preserving
version of the K2 algorithm to construct the structure of a Bayesian
network for the parties' joint data.
(Joint work with Zhiqiang Yang.)
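
For readers unfamiliar with K2, the following is a minimal sketch of the
standard, non-private K2 search run on a single centralized table. It is
not the privacy-preserving protocol of the talk: it assumes all data sits
in one place, and the names k2_log_score, k2, arity, and max_parents are
illustrative choices, not taken from the talk. In the two-party setting
described above, the counts used by the score would have to be computed
jointly without revealing either database, which this sketch does not
attempt.

```python
from collections import Counter
from math import lgamma


def k2_log_score(data, child, parents, arity):
    """Log of the K2 score of `child` given a candidate parent set.

    data    : list of rows (tuples of discrete values), all in one place
    child   : column index of the node being scored
    parents : list of column indices proposed as parents
    arity   : dict mapping column index -> number of distinct values
    """
    r = arity[child]
    n_ij = Counter()   # parent configuration -> row count
    n_ijk = Counter()  # (parent configuration, child value) -> row count
    for row in data:
        config = tuple(row[p] for p in parents)
        n_ij[config] += 1
        n_ijk[(config, row[child])] += 1

    # log of prod_j (r-1)! / (N_ij + r - 1)! * prod_k N_ijk!
    # Unobserved parent configurations contribute a factor of 1 (log 0).
    score = 0.0
    for n in n_ij.values():
        score += lgamma(r) - lgamma(n + r)
    for n in n_ijk.values():
        score += lgamma(n + 1)
    return score


def k2(data, order, arity, max_parents):
    """Greedy K2 search: parents are drawn only from earlier nodes in `order`."""
    parents = {v: [] for v in order}
    for idx, child in enumerate(order):
        best = k2_log_score(data, child, parents[child], arity)
        candidates = set(order[:idx])
        while candidates and len(parents[child]) < max_parents:
            # Score every single-parent extension and keep the best one.
            scored = [
                (k2_log_score(data, child, parents[child] + [c], arity), c)
                for c in sorted(candidates)
            ]
            new_score, pick = max(scored)
            if new_score <= best:
                break  # no extension improves the score; stop adding parents
            best = new_score
            parents[child].append(pick)
            candidates.remove(pick)
    return parents
```

As a usage example, on a toy table in which column 1 always copies
column 0 and column 2 is independent of both, calling
k2(data, [0, 1, 2], {0: 2, 1: 2, 2: 2}, 2) selects column 0 as the only
parent of column 1 and leaves column 2 without parents.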
Gates 459 Friday 08/13/04 1500 hrs