Belief Networks


The set of conditional relationships is often expressed as a graph of nodes (events) linked together by directed (causes) and undirected (association) lines.  Because of the central role played by Bayes’ theorem in these calculations such graphs are sometimes called Bayesian nets although the more common term used is Belief Networks.

Bayes Theorem says that P(A|B) (the probability of an event A, given event B has occurred) is equal to:

P(B|A) P(A)

Sometimes in words this is expressed as the “posterior” is equal to the “likelihood” multiplied by the “prior” divided by the “evidence.” The key relationship is the proportionality of the posterior to the likelihood multiplied by the prior.

The power of Belief Networks is that they encode conditional independence.  If no path exists between two nodes then the events represented by those nodes are independent.  Directed links represent causal relationships, undirected links represent associations.  The calculus of Belief Networks is based on the removal or redirection of links associated with a node when information is known about the event that the node represents.  This removal and redirection reflects conditional independence.

For example, the example below encodes the features (Fi) are conditionally independent given the disease D. 

The mathematical expression for this graph is:

P(D,F1,…,Fn) = ( ∏ P(Fi|D) )  P(D)

Although simple, this kind of encoding of conditionally independent relationships has proven very useful in many applications including text classification and disambiguation of word senses because the calculations are very straightforward and experiments prove that the predictions are typically accurate.  It is called “Idiot’s Bayes” or, more graciously, “naïve Bayes.”

A small but interesting Belief Network is illustrated below.  It is a directed acyclic graph depicting some possible diseases that could lead to a blue baby.  The relationships and the (conditional) probabilities that are not shown were elicited after consultation with an expert pediatrician. This Belief Network is fairly typical – what we wish to conclude (probabilistically) are nodes that are nearer to the top – what we typically observe are nodes nearer to the bottom.  Observations may be expressed as probabilities.  A Belief Network may then be completed (that is, all remaining probabilities may be determined) through a series of graph manipulations and calculations. 


(From Probabilistic Networks and Expert Systems, Cowell, et al).

One use for Belief Networks is modeling risk.  Different risk factors may be identified and probabilities may be estimated from experience or from judgment.  A new situation that is presented may have known attributes and unknown attributes.  Regardless, probabilities of outcomes may be computed.


Belief Network Operations

Belief Networks are defined from models that are elicited from experts or automatically inferred.  The inference of network structure is a difficult and interesting area for research and will not be discussed further here.  Modeling involves the specification of the relevant variables (generally including both observable and hidden variables), the dependencies between the variables, and the component probabilities for the model.

After a model has been specified it is compiled into a structure more amenable to computation. The graph that encodes the conditional independence relationships constructed by an expert is transformed into an equivalent graph.  This involves recasting a directed graph (or a graph with both directed and undirected edges) into an entirely undirected graph. Then edges are added to this graph creating cliques that will enable the joint density of the entire graph to be transformed into a product of factors.

The inference process for a particular situation essentially binds a subset of the variables in the graph and uses the compiled graph to calculate the remainder of the variables through an local update algorithm. I recommend the book by Cowell, et al. as a very complete introduction to Belief Network ideas and algorithms and the book by Pearl as an excellent introduction to Probabilistic Logic.



An excellent introductory site is:

Cowell, R. G., Dawid, A. P., Lauritzen, S. L.  and Spiegelhalter, D. J.. "Probabilistic Networks and Expert Systems". Springer-Verlag. 1999.

Heckerman, David, "A Tutorial on Learning with Bayesian Networks," Technical Report MSR-TR-95-06, Microsoft Research, 1995.

Jensen, F. V. "Bayesian Networks and Decision Graphs". Springer. 2001.

Jordan, M. I.  (ed). "Learning in Graphical Models". MIT Press. 1998.

Pearl, Judea, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, CA, 1988.