Privacy-Preserving Bayesian Network Structure Computation on Distributed Heterogeneous Data

Rebecca Wright, Stevens Institute of Technology

As more and more activities are carried out using computers and computer networks, the amount of potentially sensitive data stored by business, governments, and other parties increases. Different parties may wish to benefit from cooperative use of their data, but privacy regulations and other privacy concerns may prevent the parties from sharing their data. Privacy-preserving data mining provides a solution by creating distributed data mining algorithms in which the underlying data is not revealed. We present a privacy-preserving protocol for a particular data mining task: learning the Bayesian network structure for distributed heterogeneous data. In this setting, two parties owning confidential databases wish to learn the structure of a Bayesian network on the combination of their databases without revealing anything about their data to each other. We give an efficient and privacy-preserving version of the K2 algorithm to construct the structure of a Bayesian network for the parties' joint data. (Joint work with Zhiqiang Yang.)

Rebecca Wright

Gates 459 Friday 08/13/04 1500 hrs