The d-separability algorithm in the Koller text answers the question, “From where in the network can information flow to a query node X?” or, equivalently, “Which nodes, if observed, would influence X?” However, an unobserved node may be eligible for pruning even though it is not d-separated from X. Thus, once we know which nodes are evidence and which are queries, we can extend the d-separability algorithm into a more aggressive pruning algorithm.

We have two main issues to consider.

Consider the example where we have a Bayesian network with two nodes, $X$ and $\theta$, distributed as follows:

$\theta \sim N(\mu, \tau^2)\,$

$X \sim N(\theta, \sigma^2)\,$

where $\mu$, $\tau^2$, and $\sigma^2$ are constants.

Initially, it may appear possible to prune $\theta$: the node is unobserved, so it has no information to give to $X$. However, $\theta$ cannot actually be pruned. One way to see this is to note that the constant parameters $\mu$ and $\tau^2$ carry prior information about $\theta$, which in turn shapes the marginal distribution of $X$.
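
To see this concretely, here is a quick Monte Carlo check with illustrative constants (the specific values are not from the text). Marginally, $X \sim N(\mu, \tau^2 + \sigma^2)$, so the parameters of $\theta$ show up in $X$'s distribution even though $\theta$ is never observed:

```python
import random

# Illustrative constants: mu = 5, tau^2 = 4, sigma^2 = 1.
mu, tau, sigma = 5.0, 2.0, 1.0

random.seed(0)
xs = []
for _ in range(200_000):
    theta = random.gauss(mu, tau)          # theta ~ N(mu, tau^2)
    xs.append(random.gauss(theta, sigma))  # X ~ N(theta, sigma^2)

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
# Marginally X ~ N(mu, tau^2 + sigma^2); pruning theta (and its
# parameters) would lose this prior information about X.
```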

Rather than try to come up with complicated rules to figure out which nodes can be pruned, there are two simple solutions to pick from. The first approach is to create a node for any constant parameter. These nodes are observed, since they hold a fixed value, and they can be considered implicit evidence nodes. In this case, we would create a node for each of $\mu$, $\tau^2$, and $\sigma^2$. This ensures that the prior information is taken into account.
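A minimal sketch of this first approach, assuming a network represented as a dict mapping each node to its parent list (the function and node names here are illustrative, not from any particular library):

```python
def add_parameter_nodes(parents, params, evidence):
    """First approach (sketch): create an observed node for each
    constant parameter. `parents` maps node -> list of parents;
    `params` maps node -> list of its constant-parameter names."""
    parents = {n: list(ps) for n, ps in parents.items()}
    evidence = set(evidence)
    for node, names in params.items():
        for name in names:
            pnode = f"{name}_{node}"   # e.g. "mu_theta" (illustrative)
            parents[pnode] = []        # constants have no parents
            parents[node].append(pnode)
            evidence.add(pnode)        # a fixed value counts as observed
    return parents, evidence

# The two-node example: theta -> X with constants mu, tau^2, sigma^2.
new_parents, new_evidence = add_parameter_nodes(
    {"theta": [], "X": ["theta"]},
    {"theta": ["mu", "tau2"], "X": ["sigma2"]},
    set())
```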

The second approach is to add an implicit evidence node as a parent to each node in the network, regardless of whether there is a constant parameter. This should give the same effect as the first approach (let me know if you have a counter-example).
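A sketch of the second approach, with a hypothetical `prior_*` naming scheme: every node gets exactly one observed implicit-evidence parent, whether or not it has constant parameters:

```python
def add_implicit_evidence(parents, evidence):
    """Second approach (sketch): add an observed implicit-evidence
    parent to every node. `parents` maps node -> list of parents;
    returns updated copies."""
    parents = {n: list(ps) for n, ps in parents.items()}
    evidence = set(evidence)
    for node in list(parents):     # snapshot: we add nodes as we go
        ev = f"prior_{node}"       # hypothetical naming scheme
        parents[ev] = []           # the implicit node has no parents
        parents[node].append(ev)
        evidence.add(ev)           # it is always observed
    return parents, evidence
```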

In any case, nodes should not be pruned if evidence can flow through them to the query nodes, even if that “evidence” is really prior belief.

We would like to prune all nodes that are neither evidence nodes nor on some active path to an evidence node. The d-separability algorithm searches through the network and finds all reachable nodes, but it doesn't keep track of which paths led to each reachable node. We must modify the algorithm to track paths.

The main change is to think of the reachable set as a mapping rather than a set. In the algorithm in the book, we only needed to know whether or not a node is in the set. For our modified algorithm, we also need a list of neighboring nodes for each reachable node. Thus, the reachable mapping maps from a reachable node to a list of its neighboring nodes. We will add nodes to this list as we search through the network.

When we pop items from the search queue, we pop a (node, direction, previous) triple rather than a (node, direction) pair. The previous element is the node we came from to reach the current node, so as soon as we pop from the search queue, we add previous to the reachable[node] list. Unlike in the book, we make this update to reachable before checking whether (node, direction) is in the visited set: even if we have already visited the node in this direction, the new previous node may lie on a different active path, and we still need to record it as a neighbor.
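
The modified search can be sketched as follows. This follows the shape of the book's reachability algorithm (ancestor pass, then a directed BFS), but the function name, dict-based network representation, and use of sets for the neighbor lists are my own assumptions:

```python
from collections import defaultdict, deque

def reachable_with_paths(parents, children, X, Z):
    """Modified reachability search (sketch): returns a mapping
    node -> set of neighboring nodes through which an active trail
    from X reaches it, rather than a plain reachable set.
    parents/children map node -> list of nodes; Z is the evidence set."""
    # Phase 1: collect the evidence nodes and all of their ancestors,
    # needed to decide when a v-structure is activated.
    ancestors, frontier = set(), list(Z)
    while frontier:
        y = frontier.pop()
        if y not in ancestors:
            ancestors.add(y)
            frontier.extend(parents[y])

    reachable = defaultdict(set)  # node -> neighbors on active trails
    visited = set()
    queue = deque([(X, "up", None)])  # (node, direction, previous)
    while queue:
        node, direction, prev = queue.popleft()
        # Record the neighbor *before* the visited check, so every
        # active path into node is remembered.
        if prev is not None:
            reachable[node].add(prev)
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if direction == "up" and node not in Z:
            for p in parents[node]:
                queue.append((p, "up", node))
            for c in children[node]:
                queue.append((c, "down", node))
        elif direction == "down":
            if node not in Z:
                for c in children[node]:
                    queue.append((c, "down", node))
            if node in ancestors:  # v-structure with observed descendant
                for p in parents[node]:
                    queue.append((p, "up", node))
    return dict(reachable)
```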

After completing the search, we can build up a set of nodes that cannot be pruned. We start by taking all reachable evidence nodes. We do not add *all* reachable nodes because many non-evidence reachable nodes can be pruned without any problem. However, we still need all nodes along the active paths, so we add all of the neighboring nodes for the reachable evidence nodes as well. We continue adding nodes along active paths by adding all neighboring nodes for the nodes we just added, and repeat this process until there are no new nodes to add.
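
The closure step above can be sketched as follows, assuming the `reachable` mapping produced by the modified search (the function name is illustrative):

```python
def unprunable(reachable, evidence):
    """Close over the reachable mapping, starting from the evidence
    nodes the search actually reached (a sketch, assuming `reachable`
    maps node -> set of neighbors on active trails)."""
    keep = {e for e in evidence if e in reachable}
    frontier = list(keep)
    # Pull in every node on an active path to a kept node, repeating
    # until no new nodes are added.
    while frontier:
        node = frontier.pop()
        for nbr in reachable.get(node, ()):
            if nbr not in keep:
                keep.add(nbr)
                frontier.append(nbr)
    return keep
```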

If we have followed this process, we have found the set of all nodes along active paths to evidence, and we can prune any node that does not appear in this set.