A Relational Database Approach to Bayesian Network Knowledge Discovery

Tzu-Tsung Wong, Chun-Nan Hsu, Chia-Che Ma

 

TR-IIS-99-007


Keywords: Bayesian networks, Dirichlet distribution, normalization, relational databases


Abstract

 

   A Bayesian network is a powerful formalism for decision-making and knowledge
   discovery, so a training approach that scales to large real-world
   applications is important. Bayesian network training consists of two steps.
   First, experts select parameters consistent with their confidence in order
   to transform the conditional probabilities into Dirichlet priors. Then the
   conditional posteriors for the variables in the network are obtained by
   Bayesian updating. In this paper, we present a database scheme for storing
   and manipulating a large Bayesian network, together with its training data
   sets, in a relational database. This scheme facilitates Bayesian network
   training and lets the system take advantage of the benefits of the
   relational data model. The scheme also tolerates incomplete training data
   and applies to Bayesian networks with arbitrary graph structures. Since
   relational database management systems (RDBMSs) are widely available, this
   scheme greatly simplifies the construction of a Bayesian-network-based KDD
   system.
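
   The two training steps described in the abstract can be illustrated with a
   minimal sketch. It is not the scheme presented in this paper: the table
   names (cpt, cases), the column names, the equivalent sample size s, and the
   example probabilities are all assumptions made for illustration. The sketch
   stores Dirichlet parameters for one child variable with one parent in an
   in-memory SQLite database, adds observed counts to them with a single SQL
   UPDATE, and normalizes the result to obtain posterior means. Cases with a
   missing parent value are simply skipped here, which is a simplification and
   not necessarily how the paper treats incomplete data.

```python
import sqlite3


def build_demo():
    """Create a hypothetical schema and load an expert prior plus training cases."""
    conn = sqlite3.connect(":memory:")
    # Assumed layout: one row per (parent value, child value) cell,
    # holding the Dirichlet parameter alpha for that cell.
    conn.execute(
        "CREATE TABLE cpt (parent TEXT, child TEXT, alpha REAL, "
        "PRIMARY KEY (parent, child))"
    )
    conn.execute("CREATE TABLE cases (parent TEXT, child TEXT)")

    # Step 1: turn expert probabilities into a Dirichlet prior.
    # alpha = s * p, where s (assumed here) expresses the expert's confidence.
    s = 10.0
    expert = {
        ("yes", "pos"): 0.8, ("yes", "neg"): 0.2,
        ("no", "pos"): 0.1, ("no", "neg"): 0.9,
    }
    conn.executemany(
        "INSERT INTO cpt VALUES (?, ?, ?)",
        [(p, c, s * prob) for (p, c), prob in expert.items()],
    )

    # Made-up training data; the last case has a missing parent value (NULL).
    data = (
        [("yes", "pos")] * 6 + [("yes", "neg")] * 4
        + [("no", "neg")] * 10 + [(None, "pos")]
    )
    conn.executemany("INSERT INTO cases VALUES (?, ?)", data)
    return conn


def bayesian_update(conn):
    """Step 2: posterior alpha = prior alpha + observed count for each cell."""
    # A NULL parent never satisfies the equality test, so incomplete
    # cases contribute nothing in this simplified sketch.
    conn.execute(
        "UPDATE cpt SET alpha = alpha + ("
        "  SELECT COUNT(*) FROM cases"
        "  WHERE cases.parent = cpt.parent AND cases.child = cpt.child)"
    )


def posterior_means(conn):
    """Normalize alpha within each parent value to get P(child | parent)."""
    rows = conn.execute(
        "SELECT c.parent, c.child, c.alpha / t.total "
        "FROM cpt AS c JOIN ("
        "  SELECT parent, SUM(alpha) AS total FROM cpt GROUP BY parent"
        ") AS t ON c.parent = t.parent"
    ).fetchall()
    return {(p, c): mean for p, c, mean in rows}


if __name__ == "__main__":
    conn = build_demo()
    bayesian_update(conn)
    for cell, mean in sorted(posterior_means(conn).items()):
        print(cell, round(mean, 3))
```

   With these made-up numbers, the cell (yes, pos) starts at alpha = 8, gains
   6 observations, and its normalized estimate moves from the prior 0.8 to
   14/20 = 0.7. Keeping the update in SQL means the counting and normalization
   run inside the database engine, which is the kind of benefit the abstract
   attributes to a relational representation.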