A Relational Database Approach to Bayesian Network Knowledge Discovery
Tzu-Tsung Wong, Chun-Nan Hsu, Chia-Che Ma
Keywords:
Bayesian networks, Dirichlet distribution, normalization,
relational databases
Abstract
A Bayesian network is a powerful formalism for decision-making and knowledge
discovery. Large-scale real-world applications need a practical approach to
Bayesian network training. Training consists of the
following two steps. Experts first select appropriate parameters consistent
with their confidence to transform the conditional probabilities into
Dirichlet priors. Then the conditional posteriors for the variables in the
network can be obtained by Bayesian updating. In this paper, we present a
database scheme to store and manipulate a large Bayesian network as well as
training data sets in a relational database. This scheme facilitates
Bayesian network training and allows the system to take advantage of the
benefits of relational data models. The scheme also tolerates incomplete
training data and applies to Bayesian networks with arbitrary graph
structures. Since relational database management systems (RDBMSs) are widely
available, this scheme greatly simplifies the construction of a
Bayesian-network-based KDD system.
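
To make the two training steps concrete, the following is a minimal sketch of Dirichlet updating carried out inside a relational database. It is not the scheme presented in this paper: the table names (cpt, training), the column names, the two-node Cloudy -> Rain network, and the prior values are all assumptions chosen for illustration, and SQLite stands in for a general RDBMS. Incomplete training rows are handled in the simplest possible way, by leaving them out of the counts for the cells they cannot be matched to, which may differ from how the paper treats them.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database holding one CPT and a small training set."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical layout: one row per cell of P(rain | cloudy); alpha is the
    # Dirichlet parameter the expert assigned to that cell.
    conn.execute(
        "CREATE TABLE cpt (cloudy TEXT, rain TEXT, alpha REAL, "
        "PRIMARY KEY (cloudy, rain))"
    )
    conn.execute("CREATE TABLE training (cloudy TEXT, rain TEXT)")
    # Assumed priors: P(rain=yes | cloudy=yes) = 0.8 and
    # P(rain=yes | cloudy=no) = 0.1, each with an equivalent sample size of 10.
    conn.executemany("INSERT INTO cpt VALUES (?, ?, ?)", [
        ("yes", "yes", 8.0), ("yes", "no", 2.0),
        ("no", "yes", 1.0), ("no", "no", 9.0),
    ])
    # Made-up training cases; None is stored as NULL (a missing value).
    conn.executemany("INSERT INTO training VALUES (?, ?)", [
        ("yes", "yes"), ("yes", "yes"), ("yes", "no"), ("no", "no"),
        ("yes", None), (None, "yes"),
    ])
    conn.commit()
    return conn


def update_with_training_data(conn):
    """Bayesian updating: add the observed count of each cell to its alpha."""
    # A comparison with NULL is never true in SQL, so incomplete rows
    # contribute nothing to the counts.
    conn.execute(
        "UPDATE cpt SET alpha = alpha + ("
        "  SELECT COUNT(*) FROM training AS t"
        "  WHERE t.cloudy = cpt.cloudy AND t.rain = cpt.rain)"
    )
    conn.commit()


def posterior_means(conn):
    """Return {(cloudy, rain): posterior mean of P(rain | cloudy)}."""
    rows = conn.execute(
        "SELECT c.cloudy, c.rain, c.alpha / s.total "
        "FROM cpt AS c JOIN ("
        "  SELECT cloudy, SUM(alpha) AS total FROM cpt GROUP BY cloudy"
        ") AS s ON c.cloudy = s.cloudy"
    ).fetchall()
    return {(cloudy, rain): p for cloudy, rain, p in rows}


if __name__ == "__main__":
    conn = make_demo_db()
    update_with_training_data(conn)
    for cell, prob in sorted(posterior_means(conn).items()):
        print(cell, round(prob, 3))
```

With these assumed numbers, the three complete cases with cloudy = yes move the posterior mean of P(rain = yes | cloudy = yes) from 8/10 to 10/13. Because the update is a single SQL statement over the stored parameters, the counting is done by the database engine rather than by application code.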