Citation | IEEE Transactions on Knowledge and Data Engineering (TKDE), Vol. 18, No. 9, 2006. Earlier versions of parts of the work appeared in the Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2004, and in the Proceedings of the International Workshop on Privacy Data Management (held in conjunction with ICDE '05), 2005. |
Authors |
Zhiqiang Yang
Rebecca N. Wright |
Abstract | Traditionally, many data mining techniques have been designed in the centralized model, in which all data is collected and available at one central site. However, as more and more activities are carried out using computers and computer networks, the amount of potentially sensitive data stored by businesses, governments, and other parties increases. Different parties often wish to benefit from cooperative use of their data, but privacy regulations and other privacy concerns may prevent the parties from sharing their data. Privacy-preserving data mining provides a solution by creating distributed data mining algorithms in which the underlying data need not be revealed. In this paper, we present privacy-preserving protocols for a particular data mining task: learning a Bayesian network from a database vertically partitioned among two parties. In this setting, two parties owning confidential databases wish to learn the Bayesian network on the combination of their databases without revealing anything else about their data to each other. We present an efficient and privacy-preserving protocol to construct a Bayesian network on the parties' joint data.