py-bbn

py-bbn is a Python implementation of probabilistic and causal inference in Bayesian Belief Networks using exact inference algorithms [CGH97, Cow98, HD99, Kol09, Mur12].

You may install py-bbn from PyPI.

pip install pybbn

If you like py-bbn, you might be interested in our next-generation products.

turing_bbn is a C++17 implementation of py-bbn; take your causal and probabilistic inferences to the next computing level!

pyspark-bbn is a scalable, massively parallel processing (MPP) framework for learning the structures and parameters of Bayesian Belief Networks (BBNs) using Apache Spark.

Please contact us at info@oneoffcoder.com. Let’s reach for success!

Probabilistic Inference

The probabilistic inference algorithm used by py-bbn is an exact inference algorithm. Let’s go through an example on how to conduct exact inference.

Huang Graph

Below is the code to create the Huang Graph [HD99]. The typical procedure is as follows.

  • create a Bayesian Belief Network (BBN)

  • create a junction tree from the graph

  • assert evidence

  • print out the marginal probabilities

Huang Bayesian Belief Network structure.

from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.jointree import EvidenceBuilder
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.pptc.inferencecontroller import InferenceController

# create the nodes
a = BbnNode(Variable(0, 'a', ['on', 'off']), [0.5, 0.5])
b = BbnNode(Variable(1, 'b', ['on', 'off']), [0.5, 0.5, 0.4, 0.6])
c = BbnNode(Variable(2, 'c', ['on', 'off']), [0.7, 0.3, 0.2, 0.8])
d = BbnNode(Variable(3, 'd', ['on', 'off']), [0.9, 0.1, 0.5, 0.5])
e = BbnNode(Variable(4, 'e', ['on', 'off']), [0.3, 0.7, 0.6, 0.4])
f = BbnNode(Variable(5, 'f', ['on', 'off']), [0.01, 0.99, 0.01, 0.99, 0.01, 0.99, 0.99, 0.01])
g = BbnNode(Variable(6, 'g', ['on', 'off']), [0.8, 0.2, 0.1, 0.9])
h = BbnNode(Variable(7, 'h', ['on', 'off']), [0.05, 0.95, 0.95, 0.05, 0.95, 0.05, 0.95, 0.05])

# create the network structure
bbn = Bbn() \
    .add_node(a) \
    .add_node(b) \
    .add_node(c) \
    .add_node(d) \
    .add_node(e) \
    .add_node(f) \
    .add_node(g) \
    .add_node(h) \
    .add_edge(Edge(a, b, EdgeType.DIRECTED)) \
    .add_edge(Edge(a, c, EdgeType.DIRECTED)) \
    .add_edge(Edge(b, d, EdgeType.DIRECTED)) \
    .add_edge(Edge(c, e, EdgeType.DIRECTED)) \
    .add_edge(Edge(d, f, EdgeType.DIRECTED)) \
    .add_edge(Edge(e, f, EdgeType.DIRECTED)) \
    .add_edge(Edge(c, g, EdgeType.DIRECTED)) \
    .add_edge(Edge(e, h, EdgeType.DIRECTED)) \
    .add_edge(Edge(g, h, EdgeType.DIRECTED))

# convert the BBN to a join tree
join_tree = InferenceController.apply(bbn)

# insert an observation evidence
ev = EvidenceBuilder() \
    .with_node(join_tree.get_bbn_node_by_name('a')) \
    .with_evidence('on', 1.0) \
    .build()
join_tree.set_observation(ev)

# print the posterior probabilities
for node, posteriors in join_tree.get_posteriors().items():
    p = ', '.join([f'{val}={prob:.5f}' for val, prob in posteriors.items()])
    print(f'{node} : {p}')
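The posteriors printed above can be cross-checked without a junction tree. The sketch below is not part of py-bbn: it enumerates every joint assignment of the eight Huang variables in plain Python and normalizes. It assumes that each flat probability list holds one row per combination of parent states (last parent varying fastest), with each row listing P(on) then P(off); the names `NETWORK`, `local_prob` and `posteriors` are hypothetical.

```python
from itertools import product

STATES = ['on', 'off']

# node name -> (ordered parent names, flat CPT copied from the listing above)
# ASSUMPTION: one CPT row per parent-state combination, last parent fastest,
# each row listing P(on), P(off).
NETWORK = {
    'a': ([], [0.5, 0.5]),
    'b': (['a'], [0.5, 0.5, 0.4, 0.6]),
    'c': (['a'], [0.7, 0.3, 0.2, 0.8]),
    'd': (['b'], [0.9, 0.1, 0.5, 0.5]),
    'e': (['c'], [0.3, 0.7, 0.6, 0.4]),
    'f': (['d', 'e'], [0.01, 0.99, 0.01, 0.99, 0.01, 0.99, 0.99, 0.01]),
    'g': (['c'], [0.8, 0.2, 0.1, 0.9]),
    'h': (['e', 'g'], [0.05, 0.95, 0.95, 0.05, 0.95, 0.05, 0.95, 0.05]),
}


def local_prob(name, assignment):
    """Look up P(name = its value | parents = their values) in the flat CPT."""
    parents, probs = NETWORK[name]
    row = 0
    for parent in parents:
        row = row * len(STATES) + STATES.index(assignment[parent])
    return probs[row * len(STATES) + STATES.index(assignment[name])]


def posteriors(evidence):
    """Brute-force marginal posteriors given a dict of observed values."""
    names = list(NETWORK)
    totals = {n: {s: 0.0 for s in STATES} for n in names}
    z = 0.0
    for values in product(STATES, repeat=len(names)):
        assignment = dict(zip(names, values))
        # skip assignments that contradict the evidence
        if any(assignment[k] != v for k, v in evidence.items()):
            continue
        joint = 1.0
        for n in names:
            joint *= local_prob(n, assignment)
        z += joint
        for n in names:
            totals[n][assignment[n]] += joint
    return {n: {s: t / z for s, t in totals[n].items()} for n in names}


posterior = posteriors({'a': 'on'})
for name, dist in posterior.items():
    print(name, ', '.join(f'{s}={p:.5f}' for s, p in dist.items()))
```

Enumeration visits all 2^8 assignments, so it is only practical for tiny networks; the junction tree exists precisely to avoid this blow-up.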

A Bayesian Belief Network (BBN) is defined as a pair (G, P), where

  • G is a directed acyclic graph (DAG)

  • P is a joint probability distribution

  • and G satisfies the Markov Condition (each node is conditionally independent of its non-descendants given its parents)

Ideally, the API would force the user to define G and P separately. Instead, this API defines each node together with its local probability model (conditional probability table) and adds the structure afterwards, which may cause a bit of cognitive friction. But this approach seems a bit more concise, no?
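The Markov Condition is what makes it possible to specify P through local conditional probability tables. As a reminder (standard textbook notation, not py-bbn syntax), the joint distribution factorizes over the parents \(\mathrm{pa}(X_i)\) of each node, shown here in general and for the Huang graph defined earlier:

```latex
P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)

P(a, b, c, d, e, f, g, h) = P(a)\, P(b \mid a)\, P(c \mid a)\, P(d \mid b)\, P(e \mid c)\, P(f \mid d, e)\, P(g \mid c)\, P(h \mid e, g)
```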

Updating Conditional Probability Tables

Sometimes, you may want to preserve the join tree structure and just update the conditional probability tables (CPTs). Here’s how to do so.

from pybbn.graph.dag import Bbn
from pybbn.graph.edge import EdgeType, Edge
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.pptc.inferencecontroller import InferenceController

# you have built a BBN
a = BbnNode(Variable(0, 'a', ['t', 'f']), [0.2, 0.8])
b = BbnNode(Variable(1, 'b', ['t', 'f']), [0.1, 0.9, 0.9, 0.1])
bbn = Bbn().add_node(a).add_node(b) \
    .add_edge(Edge(a, b, EdgeType.DIRECTED))

# you have built a junction tree from the BBN
# let's call this "original" junction tree the left-hand side (lhs) junction tree
lhs_jt = InferenceController.apply(bbn)

# you may just update the CPTs with the original junction tree structure
# the algorithm to find/build the junction tree is avoided
# the CPTs are updated
rhs_jt = InferenceController.reapply(lhs_jt, {0: [0.3, 0.7], 1: [0.2, 0.8, 0.8, 0.2]})

# let's print out the marginal probabilities and see how things changed
# print the marginal probabilities for the lhs junction tree
print('lhs probabilities')
# print the posterior probabilities
for node, posteriors in lhs_jt.get_posteriors().items():
    p = ', '.join([f'{val}={prob:.5f}' for val, prob in posteriors.items()])
    print(f'{node} : {p}')

# print the marginal probabilities for the rhs junction tree
print('rhs probabilities')
for node, posteriors in rhs_jt.get_posteriors().items():
    p = ', '.join([f'{val}={prob:.5f}' for val, prob in posteriors.items()])
    print(f'{node} : {p}')

Note that we use InferenceController.reapply(...) to apply the new CPTs to a previous junction tree, and that we get a new junction tree as output.
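To see what kind of change to expect between the two trees, the marginal of b can be worked out by hand from each pair of CPTs. This is a plain-Python sketch, independent of py-bbn, and it assumes the flat list for b is laid out as P(b=t|a=t), P(b=f|a=t), P(b=t|a=f), P(b=f|a=f); `marginal_b` is a hypothetical helper.

```python
def marginal_b(p_a, p_b_given_a):
    """Marginalize a out of P(a) * P(b | a) for a two-node network a -> b.

    ASSUMED layout of p_b_given_a:
    [P(b=t|a=t), P(b=f|a=t), P(b=t|a=f), P(b=f|a=f)]
    """
    p_b_t = p_a[0] * p_b_given_a[0] + p_a[1] * p_b_given_a[2]
    p_b_f = p_a[0] * p_b_given_a[1] + p_a[1] * p_b_given_a[3]
    return [p_b_t, p_b_f]


# original CPTs (the lhs junction tree)
lhs = marginal_b([0.2, 0.8], [0.1, 0.9, 0.9, 0.1])
# updated CPTs (the rhs junction tree)
rhs = marginal_b([0.3, 0.7], [0.2, 0.8, 0.8, 0.2])

print(f'lhs b : t={lhs[0]:.5f}, f={lhs[1]:.5f}')
print(f'rhs b : t={rhs[0]:.5f}, f={rhs[1]:.5f}')
```

Under that layout, P(b=t) moves from 0.2·0.1 + 0.8·0.9 = 0.74 to 0.3·0.2 + 0.7·0.8 = 0.62.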

Gaussian Inference

Inference on a Gaussian Bayesian Network (GBN) is accomplished through updating the means and covariance matrix incrementally [CGH97]. The following GBN comes from [Cow98].

Cowell GBN structure.

The variables come from the following Gaussian distributions.

  • \(Y = \mathcal{N}(0, 1)\)

  • \(X = \mathcal{N}(Y, 1)\)

  • \(Z = \mathcal{N}(X, 1)\)

Below is a code sample of how we can perform inference on this GBN.

import numpy as np

from pybbn.gaussian.inference import GaussianInference


def get_cowell_data():
    """
    Gets Cowell data.

    :return: Data and headers.
    """
    n = 10000
    Y = np.random.normal(0, 1, n)
    X = np.random.normal(Y, 1, n)
    Z = np.random.normal(X, 1, n)

    D = np.vstack([Y, X, Z]).T
    return D, ['Y', 'X', 'Z']


# assume we have data and headers (variable names per column)
# X is the data (rows are observations, columns are variables)
# H is just a list of variable names
X, H = get_cowell_data()

# then we can compute the means and covariance matrix easily
M = X.mean(axis=0)
E = np.cov(X.T)

# the means and covariance matrix are all we need for gaussian inference
# notice how we keep `g` around?
# we'll use `g` over and over to do inference with evidence/observations
g = GaussianInference(H, M, E)
# {'Y': (0.00967, 0.98414), 'X': (0.01836, 2.02482), 'Z': (0.02373, 3.00646)}
print(g.P)

# we can make a single observation with do_inference()
g1 = g.do_inference('X', 1.5)
# {'X': (1.5, 0), 'Y': (0.76331, 0.49519), 'Z': (1.51893, 1.00406)}
print(g1.P)

# we can make multiple observations with do_inferences()
g2 = g.do_inferences([('Z', 1.5), ('X', 2.0)])
# {'Z': (1.5, 0), 'X': (2.0, 0), 'Y': (1.00770, 0.49509)}
print(g2.P)
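The numbers in the comments above are close to what the standard formula for conditioning a multivariate Gaussian on one observed variable gives. The sketch below applies that formula in plain Python to the population means and covariance implied by the three distributions (Var(Y)=1, Var(X)=2, Var(Z)=3, Cov(Y,X)=1, Cov(Y,Z)=1, Cov(X,Z)=2). It is an illustration of the textbook formula, not a claim about how GaussianInference is implemented; `condition_on` is a hypothetical name.

```python
def condition_on(names, means, cov, observed, value):
    """Condition a multivariate Gaussian on a single observed variable.

    For every other variable i, with j the observed one:
        mean_i' = mean_i + cov[i][j] / cov[j][j] * (value - mean_j)
        var_i'  = cov[i][i] - cov[i][j] ** 2 / cov[j][j]
    Returns {name: (mean, variance)}.
    """
    j = names.index(observed)
    result = {}
    for i, name in enumerate(names):
        if i == j:
            # an observed variable has no remaining uncertainty
            result[name] = (value, 0.0)
            continue
        gain = cov[i][j] / cov[j][j]
        mean = means[i] + gain * (value - means[j])
        var = cov[i][i] - gain * cov[j][i]
        result[name] = (mean, var)
    return result


# population parameters implied by Y ~ N(0, 1), X ~ N(Y, 1), Z ~ N(X, 1)
names = ['Y', 'X', 'Z']
means = [0.0, 0.0, 0.0]
cov = [
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 2.0],
    [1.0, 2.0, 3.0],
]

posterior = condition_on(names, means, cov, 'X', 1.5)
print(posterior)
```

With the exact covariance, observing X = 1.5 gives Y a mean of 0.75 and a variance of 0.5, and Z a mean of 1.5 and a variance of 1.0; the sampled data in the listing lands near these values.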

Causal Inference

Average Causal Effect

Here’s how you may estimate the Average Causal Effect (ACE) using Pearl’s do-operator [Pea88, Pea00, Pea16, Pea18]. In this example, we want to estimate the ACE of drug on recovery where recovery is true.

Z is confounding X and Y.

from pybbn.causality.ace import Ace
from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable

# create a BBN
gender_probs = [0.49, 0.51]
drug_probs = [0.23323615160349853, 0.7667638483965015,
              0.7563025210084033, 0.24369747899159663]
recovery_probs = [0.31000000000000005, 0.69,
                  0.27, 0.73,
                  0.13, 0.87,
                  0.06999999999999995, 0.93]

X = BbnNode(Variable(1, 'drug', ['false', 'true']), drug_probs)
Y = BbnNode(Variable(2, 'recovery', ['false', 'true']), recovery_probs)
Z = BbnNode(Variable(0, 'gender', ['female', 'male']), gender_probs)

bbn = Bbn() \
    .add_node(X) \
    .add_node(Y) \
    .add_node(Z) \
    .add_edge(Edge(Z, X, EdgeType.DIRECTED)) \
    .add_edge(Edge(Z, Y, EdgeType.DIRECTED)) \
    .add_edge(Edge(X, Y, EdgeType.DIRECTED))

# compute the ACE
ace = Ace(bbn)
results = ace.get_ace('drug', 'recovery', 'true')
t = results['true']
f = results['false']
average_causal_impact = t - f
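For this graph, where gender is the only confounder, the do-operator reduces to the backdoor adjustment formula P(recovery | do(drug)) = Σ_gender P(recovery | drug, gender) P(gender). The plain-Python sketch below evaluates that formula by hand. It assumes the rows of recovery_probs are ordered (female, false), (female, true), (male, false), (male, true), that is, parents sorted by variable id; whether Ace.get_ace performs exactly this computation internally is not stated in the source. `p_recovery_do` is a hypothetical helper.

```python
# P(gender), in the order ['female', 'male']
P_GENDER = {'female': 0.49, 'male': 0.51}

# P(recovery = true | gender, drug), read off recovery_probs
# ASSUMPTION: rows are ordered (female, false), (female, true),
# (male, false), (male, true).
P_RECOVERY_TRUE = {
    ('female', 'false'): 0.69,
    ('female', 'true'): 0.73,
    ('male', 'false'): 0.87,
    ('male', 'true'): 0.93,
}


def p_recovery_do(drug):
    """Backdoor adjustment: sum over gender of P(rec | gender, drug) * P(gender)."""
    return sum(P_RECOVERY_TRUE[(gender, drug)] * p
               for gender, p in P_GENDER.items())


t = p_recovery_do('true')
f = p_recovery_do('false')
average_causal_effect = t - f
print(f'do(drug=true)={t:.4f}, do(drug=false)={f:.4f}, ACE={average_causal_effect:.4f}')
```

Under these assumptions the adjusted recovery rates are 0.832 with the drug and 0.7818 without it, so the effect is about 0.05.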

Serialization/Deserialization

We all need a way to save (serialize) and load (deserialize) our Bayesian Belief Networks (BBNs) and join trees (JTs). Here’s how to do so. Note that the serde (serialization/deserialization) features simply write to JSON or CSV files and load back from such files. The code takes care of the serde process.

Serializing a BBN

JSON Serialization Format

from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable

# create graph
a = BbnNode(Variable(0, 'a', ['t', 'f']), [0.2, 0.8])
b = BbnNode(Variable(1, 'b', ['t', 'f']), [0.1, 0.9, 0.9, 0.1])
bbn = Bbn().add_node(a).add_node(b) \
    .add_edge(Edge(a, b, EdgeType.DIRECTED))

# serialize
Bbn.to_json(bbn, 'simple-bbn.json')

You will get a file simple-bbn.json written out with the following content.

{
  "nodes": {
    "0": {
      "probs": [
        0.2,
        0.8
      ],
      "variable": {
        "id": 0,
        "name": "a",
        "values": [
          "t",
          "f"
        ]
      }
    },
    "1": {
      "probs": [
        0.1,
        0.9,
        0.9,
        0.1
      ],
      "variable": {
        "id": 1,
        "name": "b",
        "values": [
          "t",
          "f"
        ]
      }
    }
  },
  "edges": [
    {
      "pa": 0,
      "ch": 1
    }
  ]
}
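Because the JSON is plain data, it can be inspected with the standard library alone. As a small sanity check (a sketch, not a py-bbn feature), the code below verifies that each node's "probs" list has the length implied by its own number of values times the number of values of each parent, reading "pa" and "ch" as parent and child ids. `expected_cpt_size` is a hypothetical helper.

```python
import json
from math import prod

# the same content as simple-bbn.json, inlined so the sketch is self-contained
doc = json.loads('''
{
  "nodes": {
    "0": {"probs": [0.2, 0.8],
          "variable": {"id": 0, "name": "a", "values": ["t", "f"]}},
    "1": {"probs": [0.1, 0.9, 0.9, 0.1],
          "variable": {"id": 1, "name": "b", "values": ["t", "f"]}}
  },
  "edges": [{"pa": 0, "ch": 1}]
}
''')


def expected_cpt_size(doc, node_id):
    """Number of CPT entries: own cardinality times each parent's cardinality."""
    parents = [e['pa'] for e in doc['edges'] if e['ch'] == node_id]
    parent_sizes = [len(doc['nodes'][str(p)]['variable']['values']) for p in parents]
    own_size = len(doc['nodes'][str(node_id)]['variable']['values'])
    return own_size * prod(parent_sizes)


sizes_ok = {
    key: len(node['probs']) == expected_cpt_size(doc, node['variable']['id'])
    for key, node in doc['nodes'].items()
}
print(sizes_ok)
```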

CSV Serialization Format

from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable

# create graph
a = BbnNode(Variable(0, 'a', ['t', 'f']), [0.2, 0.8])
b = BbnNode(Variable(1, 'b', ['t', 'f']), [0.1, 0.9, 0.9, 0.1])
bbn = Bbn().add_node(a).add_node(b) \
    .add_edge(Edge(a, b, EdgeType.DIRECTED))

# serialize
Bbn.to_csv(bbn, 'simple-bbn.csv')

You will get a file simple-bbn.csv written out with the following content.

0,a,t,f,|,0.2,0.8
1,b,t,f,|,0.1,0.9,0.9,0.1
0,1,directed
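The CSV layout can be read as follows, judging from the sample: a line containing `|` describes a node (id, name, the node's values, then `|`, then its probabilities), and any other line describes an edge (parent id, child id, edge type). The parser below is a minimal sketch built on that reading, for illustration only; use Bbn.from_csv to load real files. `parse_bbn_csv` is a hypothetical name.

```python
def parse_bbn_csv(text):
    """Parse the CSV layout shown above into (nodes, edges).

    ASSUMED layout, inferred from the sample file:
      node line: id,name,value1,...,valueN,|,prob1,...,probM
      edge line: parent_id,child_id,type
    """
    nodes, edges = {}, []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(',')]
        if '|' in fields:
            bar = fields.index('|')
            nodes[int(fields[0])] = {
                'name': fields[1],
                'values': fields[2:bar],
                'probs': [float(p) for p in fields[bar + 1:]],
            }
        else:
            edges.append((int(fields[0]), int(fields[1]), fields[2]))
    return nodes, edges


sample = """
0,a,t,f,|,0.2,0.8
1,b,t,f,|,0.1,0.9,0.9,0.1
0,1,directed
"""

nodes, edges = parse_bbn_csv(sample)
print(nodes)
print(edges)
```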

Deserializing a BBN

JSON Deserialization Format

from pybbn.graph.dag import Bbn

# deserialize
bbn = Bbn.from_json('simple-bbn.json')

CSV Deserialization Format

from pybbn.graph.dag import Bbn

# deserialize
bbn = Bbn.from_csv('simple-bbn.csv')

Join Tree Serde

A join tree may also be serialized and deserialized. Only the JSON format is supported for now.

Serializing a Join Tree

import json

from pybbn.graph.dag import Bbn
from pybbn.graph.edge import EdgeType, Edge
from pybbn.graph.jointree import JoinTree
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.pptc.inferencecontroller import InferenceController

a = BbnNode(Variable(0, 'a', ['t', 'f']), [0.2, 0.8])
b = BbnNode(Variable(1, 'b', ['t', 'f']), [0.1, 0.9, 0.9, 0.1])
bbn = Bbn().add_node(a).add_node(b) \
    .add_edge(Edge(a, b, EdgeType.DIRECTED))
jt = InferenceController.apply(bbn)

with open('simple-join-tree.json', 'w') as f:
    d = JoinTree.to_dict(jt, bbn)
    j = json.dumps(d, sort_keys=True, indent=2)
    f.write(j)

You will get a file simple-join-tree.json written out with the following content.

{
  "bbn_nodes": {
    "0": {
      "probs": [
        0.2,
        0.8
      ],
      "variable": {
        "id": 0,
        "name": "a",
        "values": [
          "t",
          "f"
        ]
      }
    },
    "1": {
      "probs": [
        0.1,
        0.9,
        0.9,
        0.1
      ],
      "variable": {
        "id": 1,
        "name": "b",
        "values": [
          "t",
          "f"
        ]
      }
    }
  },
  "jt": {
    "edges": [],
    "nodes": {
      "0-1": {
        "node_ids": [
          0,
          1
        ],
        "type": "clique"
      }
    },
    "parent_info": {
      "0": [],
      "1": [
        0
      ]
    }
  }
}

Deserializing a Join Tree

import json

from pybbn.graph.jointree import JoinTree
from pybbn.pptc.inferencecontroller import InferenceController

with open('simple-join-tree.json', 'r') as f:
    j = f.read()
    d = json.loads(j)
    jt = JoinTree.from_dict(d)
    jt = InferenceController.apply_from_serde(jt)

Generating Bayesian Belief Networks

Let’s generate some Bayesian Belief Networks (BBNs). The algorithms are taken from Random Generation of Bayesian Networks [IC02]. There are two types of BBNs you may generate.

  • singly-connected

  • multi-connected

A singly-connected BBN is one where, ignoring the direction of the edges, there is at most one path between any two nodes. A multi-connected BBN is one that is not singly-connected.

Singly-connected network structure.

Multi-connected network structure. There are two paths between C and F: (C, D, F) and (C, E, F).
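The definition translates directly into a test: a DAG is singly-connected exactly when its undirected skeleton contains no cycle. The sketch below checks this with a small union-find over the edge list; it is an illustration and not part of the py-bbn generator API. `is_singly_connected` is a hypothetical name, and the example edges are one structure consistent with the two C-to-F paths named in the caption.

```python
def is_singly_connected(edges):
    """Return True if the undirected skeleton of the edge list has no cycle.

    Each edge is a (parent, child) pair; direction is ignored. An edge whose
    endpoints are already connected would create a second path between them.
    """
    root = {}

    def find(x):
        root.setdefault(x, x)
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        root[ru] = rv
    return True


# C reaches F through D only: a single path
single = [('C', 'D'), ('D', 'F'), ('C', 'E')]
# C reaches F through both D and E: two paths
multi = [('C', 'D'), ('D', 'F'), ('C', 'E'), ('E', 'F')]

print(is_singly_connected(single))
print(is_singly_connected(multi))
```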

Singly-Connected

The key method to use here is generate_singly_bbn.

import numpy as np

from pybbn.generator.bbngenerator import generate_singly_bbn, convert_for_exact_inference, convert_for_drawing

# very important to set the seed for reproducible results
np.random.seed(37)

# this method generates the graph, g, and probabilities, p
# note we are generating a singly-connected graph
g, p = generate_singly_bbn(5, max_iter=5)

# you have to convert g and p to a BBN
bbn = convert_for_exact_inference(g, p)

# you can convert the BBN to a nx graph for visualization
nx_graph = convert_for_drawing(bbn)

Multi-Connected

The key method to use here is generate_multi_bbn.

import numpy as np

from pybbn.generator.bbngenerator import generate_multi_bbn, convert_for_exact_inference, convert_for_drawing

# very important to set the seed for reproducible results
np.random.seed(37)

# this method generates the graph, g, and probabilities, p
# note we are generating a multi-connected graph
g, p = generate_multi_bbn(5, max_iter=5)

# you have to convert g and p to a BBN
bbn = convert_for_exact_inference(g, p)

# you can convert the BBN to a nx graph for visualization
nx_graph = convert_for_drawing(bbn)

Direct Generation

In the case where you do NOT need a reference to the BBN objects, use the API’s convenience method to generate and serialize the BBN directly to file.

import numpy as np

from pybbn.generator.bbngenerator import generate_bbn_to_file

# set the seed for reproducibility
np.random.seed(37)

# generate a singly-connected BBN
generate_bbn_to_file(n=10, file_path='singly-bbn.csv', bbn_type='singly', max_alpha=10)

# generate a multi-connected BBN
generate_bbn_to_file(n=10, file_path='multi-bbn.csv', bbn_type='multi', max_alpha=10)

Here’s the output for singly-bbn.csv.

0,0,state0,state1,|,0.5495149877004699,0.4504850122995299
1,1,state0,state1,|,0.35835359558290997,0.64164640441709,0.8660444980250707,0.13395550197492936
2,2,state0,state1,|,0.5828348518985648,0.4171651481014352,0.6352808281847757,0.3647191718152243
3,3,state0,state1,|,0.43155247482552955,0.5684475251744704,0.05744110250902426,0.9425588974909757,0.44585399607259946,0.5541460039274007,0.286749915005319,0.713250084994681
4,4,state0,state1,|,0.3190576398549361,0.6809423601450639,0.011424133320075755,0.9885758666799241
5,5,state0,state1,|,0.48207371043602226,0.5179262895639779,0.07147107402394111,0.9285289259760588
6,6,state0,state1,|,0.2076134466833406,0.7923865533166594,0.44542849473036455,0.5545715052696354
7,7,state0,state1,|,0.757560101942848,0.242439898057152
8,8,state0,state1,|,0.1906328058926942,0.8093671941073058,0.2814000588799281,0.7185999411200719
9,9,state0,state1,|,0.7854793106243432,0.2145206893756569,0.12392098364527641,0.8760790163547235
0,1,directed
1,2,directed
2,3,directed
3,4,directed
3,8,directed
5,6,directed
5,3,directed
7,5,directed
8,9,directed

Here’s the output for multi-bbn.csv.

0,0,state0,state1,|,0.680874572938313,0.319125427061687
1,1,state0,state1,|,0.7617263477727293,0.23827365222727065,0.3117227721913154,0.6882772278086846
2,2,state0,state1,|,0.12614472921860395,0.8738552707813961,0.7070911105993563,0.29290888940064375
3,3,state0,state1,|,0.4055587320025024,0.5944412679974977,0.9624106996627307,0.037589300337269156
4,4,state0,state1,|,0.31986562609614827,0.6801343739038517,0.022365118374575416,0.9776348816254246
5,5,state0,state1,|,0.77366174354673,0.2263382564532701,0.8579513677510221,0.1420486322489778,0.3183725110598738,0.6816274889401261,0.04262514631905535,0.9573748536809447
6,6,state0,state1,|,0.05830032685169777,0.9416996731483022,0.5840685338695271,0.41593146613047294,0.7078930065265004,0.29210699347349944,0.490562272424676,0.509437727575324
7,7,state0,state1,|,0.7569425298012309,0.243057470198769,0.6536654079476188,0.3463345920523811,0.6299885487124776,0.3700114512875224,0.4929042112083024,0.5070957887916976
8,8,state0,state1,|,0.3295640257593744,0.6704359742406256,0.9098731919901998,0.09012680800980029
9,9,state0,state1,|,0.7804943261233692,0.21950567387663072,0.43963638923803844,0.5603636107619615,0.03168532379450399,0.968314676205496,0.7189237718440259,0.28107622815597405,0.356320337335263,0.643679662664737,0.8089559692517324,0.19104403074826756,0.520364955519572,0.47963504448042804,0.3989706528653481,0.601029347134652
0,1,directed
0,9,directed
0,5,directed
1,2,directed
2,3,directed
3,4,directed
4,5,directed
4,6,directed
4,7,directed
5,6,directed
6,7,directed
6,9,directed
7,8,directed
8,9,directed

Sampling Data

Sampling data from a BBN is possible. The algorithm uses logic sampling with rejection [Hen88].

Simple Sampling

This code demonstrates simple sampling.

from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.sampling.sampling import LogicSampler

a = BbnNode(Variable(0, 'a', ['on', 'off']), [0.5, 0.5])
b = BbnNode(Variable(1, 'b', ['on', 'off']), [0.5, 0.5, 0.4, 0.6])
c = BbnNode(Variable(2, 'c', ['on', 'off']), [0.7, 0.3, 0.2, 0.8])

bbn = Bbn() \
    .add_node(a) \
    .add_node(b) \
    .add_node(c) \
    .add_edge(Edge(a, b, EdgeType.DIRECTED)) \
    .add_edge(Edge(b, c, EdgeType.DIRECTED))

sampler = LogicSampler(bbn)
samples = sampler.get_samples(n_samples=10000, seed=37)

Sampling with Rejection

This code demonstrates sampling with evidence asserted. During each round of sampling, if the sample value generated does not match with the evidence, the entire sample is discarded.

from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.sampling.sampling import LogicSampler

a = BbnNode(Variable(0, 'a', ['on', 'off']), [0.5, 0.5])
b = BbnNode(Variable(1, 'b', ['on', 'off']), [0.5, 0.5, 0.4, 0.6])
c = BbnNode(Variable(2, 'c', ['on', 'off']), [0.7, 0.3, 0.2, 0.8])

bbn = Bbn() \
    .add_node(a) \
    .add_node(b) \
    .add_node(c) \
    .add_edge(Edge(a, b, EdgeType.DIRECTED)) \
    .add_edge(Edge(b, c, EdgeType.DIRECTED))

sampler = LogicSampler(bbn)
samples = sampler.get_samples(evidence={0: 'on'}, n_samples=10000, seed=37)
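The idea of logic sampling with rejection fits in a few lines of plain Python. The sketch below samples the chain a → b → c in topological order and throws away any sample that contradicts the evidence. It is an illustration of the technique, not LogicSampler's implementation: it assumes the same CPT layout as before (one row per parent state, P(on) first), and it returns only the surviving samples, which may differ from how get_samples counts n_samples. `draw_sample` and `logic_sample` are hypothetical names.

```python
import random

STATES = ['on', 'off']

# node -> (single parent or None, flat CPT), same values as the listing above
# ASSUMPTION: one CPT row per parent state, each row listing P(on), P(off).
CPTS = {
    'a': (None, [0.5, 0.5]),
    'b': ('a', [0.5, 0.5, 0.4, 0.6]),
    'c': ('b', [0.7, 0.3, 0.2, 0.8]),
}
ORDER = ['a', 'b', 'c']  # topological order: parents before children


def draw_sample(rng):
    """Forward-sample every node given the already sampled parent value."""
    sample = {}
    for name in ORDER:
        parent, probs = CPTS[name]
        row = 0 if parent is None else STATES.index(sample[parent])
        p_on = probs[2 * row]
        sample[name] = 'on' if rng.random() < p_on else 'off'
    return sample


def logic_sample(n_samples, evidence=None, seed=37):
    """Draw n_samples forward samples and keep those matching the evidence."""
    rng = random.Random(seed)
    evidence = evidence or {}
    kept = []
    for _ in range(n_samples):
        sample = draw_sample(rng)
        # reject the whole sample if any observed node disagrees
        if all(sample[k] == v for k, v in evidence.items()):
            kept.append(sample)
    return kept


samples = logic_sample(10000, evidence={'a': 'on'})
frac_b_on = sum(s['b'] == 'on' for s in samples) / len(samples)
print(len(samples), round(frac_b_on, 3))
```

Rejection wastes every discarded draw, so the approach becomes inefficient when the evidence is unlikely; here roughly half of the draws survive because P(a=on) = 0.5.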

Create BBN with structure and data

If you know the BBN structure and have data, you can create a BBN using the structure and learn the parameters from the data. For now, the parameters are estimated from the raw counts (a non-Bayesian estimate). The method to use is Factory.from_data().

import pandas as pd
from pybbn.graph.factory import Factory

df = pd.read_csv('./data/data-from-structure.csv')
structure = {
    'a': [],
    'b': ['a'],
    'c': ['b']
}

bbn = Factory.from_data(structure, df)
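To make "parameters from raw counts" concrete, the sketch below estimates a conditional probability table by counting co-occurrences of parent and child values and normalizing within each parent configuration. It uses only the standard library and a tiny made-up data set; whether Factory.from_data normalizes in exactly this way is an assumption. `learn_cpt` and `rows` are hypothetical.

```python
from collections import Counter


def learn_cpt(rows, child, parents, values):
    """Estimate P(child | parents) from raw counts (no smoothing, no priors).

    rows    : list of dicts mapping column name -> observed value
    values  : ordered list of the child's possible values
    Returns {parent_value_tuple: [P(child=v) for v in values]}.
    """
    counts = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    cpt = {}
    for parent_values, _ in list(counts):
        total = sum(counts[(parent_values, v)] for v in values)
        cpt[parent_values] = [counts[(parent_values, v)] / total for v in values]
    return cpt


# made-up observations for the structure a -> b (illustration only)
rows = [
    {'a': 'on', 'b': 'on'},
    {'a': 'on', 'b': 'off'},
    {'a': 'on', 'b': 'on'},
    {'a': 'on', 'b': 'on'},
    {'a': 'off', 'b': 'off'},
    {'a': 'off', 'b': 'off'},
    {'a': 'off', 'b': 'on'},
    {'a': 'off', 'b': 'off'},
]

cpt_a = learn_cpt(rows, 'a', [], ['on', 'off'])
cpt_b = learn_cpt(rows, 'b', ['a'], ['on', 'off'])
print(cpt_a)
print(cpt_b)
```

A parent configuration that never occurs in the data gets no row at all, which is one reason a Bayesian estimate with pseudo-counts is often preferred.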

As usual, after you acquire a BBN, you can perform inference using an InferenceController.

from pybbn.pptc.inferencecontroller import InferenceController

join_tree = InferenceController.apply(bbn)

for node, posteriors in join_tree.get_posteriors().items():
    p = ', '.join([f'{val}={prob:.5f}' for val, prob in posteriors.items()])
    print(f'{node} : {p}')

b : off=0.55020, on=0.44980
c : off=0.57210, on=0.42790
a : off=0.49850, on=0.50150

import networkx as nx

n, d = bbn.to_nx_graph()
nx.draw(n, with_labels=True, labels=d, node_color='r', alpha=0.5)

Exact Inference with Widgets

Here, we show a very simple example of how to observe the marginal posterior probabilities of each node given the state of one. We will use the Huang graph [HD99].

Simulate data

%matplotlib inline
from pybbn.graph.dag import BbnUtil
from pybbn.graph.jointree import EvidenceBuilder, EvidenceType
from pybbn.pptc.inferencecontroller import InferenceController
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from collections import namedtuple

np.random.seed(37)
plt.style.use('ggplot')
Marginal = namedtuple('Marginal', 'name, s')

def potential_to_series(potential):
    vals = []
    index = []

    for pe in potential.entries:
        # each potential entry holds the node's value; use it as the index label
        v = list(pe.entries.values())[0]

        vals.append(pe.value)
        index.append(v)

    return pd.Series(vals, index=index)

def get_marginals(join_tree):
    data = []
    for node in join_tree.get_bbn_nodes():
        name = node.variable.name
        s = potential_to_series(join_tree.get_bbn_potential(node))
        t = Marginal(name, s)
        data.append(t)
    return data

# get the pre-defined huang graph
bbn = BbnUtil.get_huang_graph()

# convert the BBN to a join tree
join_tree = InferenceController.apply(bbn)

Visualize

import math
from ipywidgets import interact

@interact(a=[('unobserved', -1), ('off', 0), ('on', 1)])
def f(a=-1):
    n_cols = 4
    n_rows = math.ceil(len(bbn.get_nodes()) / n_cols)

    if a == -1:
        join_tree.unobserve_all()
        marginals = get_marginals(join_tree)
    else:
        v = 'on' if a == 1 else 'off'
        ev = EvidenceBuilder() \
            .with_node(join_tree.get_bbn_node_by_name('a')) \
            .with_evidence(v, 1.0) \
            .build()
        join_tree.unobserve_all()
        join_tree.set_observation(ev)
        marginals = get_marginals(join_tree)

    marginals = sorted(marginals, key=lambda tup: tup[0])

    fig, axes = plt.subplots(n_rows, n_cols, figsize=(15, 5), sharey=True)

    for m, ax in zip(marginals, np.ravel(axes)):
        m.s.plot(kind='bar', legend=False, ax=ax)
        ax.set_title(m.name)
        ax.set_ylim([0.0, 1.0])
        ax.set_xlabel('')

    plt.tight_layout()

Multivariate Gaussian Inference with Widgets

This notebook shows how to do multivariate Gaussian inference with widgets. We allow one variable to change and visualize the change of distributions for the other. We will be using the Cowell graph [Cow98].

Simulate data

%matplotlib inline
import numpy as np
from pybbn.gaussian.inference import GaussianInference
import matplotlib.pyplot as plt

np.random.seed(37)
plt.style.use('ggplot')
plt.rcParams['axes.grid'] = False

def get_cowell_data():
    n = 10000
    Y = np.random.normal(0, 1, n)
    X = np.random.normal(Y, 1, n)
    Z = np.random.normal(X, 1, n)

    D = np.vstack([Y, X, Z]).T
    return D, ['Y', 'X', 'Z']

def get_mvn():
    X, H = get_cowell_data()

    M = X.mean(axis=0)
    E = np.cov(X.T)

    g = GaussianInference(H, M, E)
    return g

g = get_mvn()

import pandas as pd

pd.DataFrame(g.marginals)

  name      mean       var
0    Y -0.001723  0.990700
1    X  0.007448  2.016406
2    Z  0.002459  3.033838

Visualize

from ipywidgets import interact

samples1 = g.sample_marginals(size=10000)

@interact(x=(-5, 5, 1))
def f(x=None):
    if x is not None:
        gg = g.do_inference('X', x)
    else:
        gg = g

    samples2 = gg.sample_marginals(size=5000)

    fig, axes = plt.subplots(1, 3, figsize=(15, 3), sharey=False)
    axes = np.ravel(axes)

    kind = 'hist'
    alpha = 0.15
    for (name, s2), ax in zip(samples2.items(), axes):
        if name == 'X':
            ax2 = ax.twinx()
            _ = samples1[name].plot(kind=kind, ax=ax2, color='blue', alpha=alpha)
            _ = ax.axvline(x=x, color='red')
            _ = ax2.set_ylabel('')
        else:
            ax2 = ax.twinx()
            _ = samples1[name].plot(kind=kind, ax=ax, color='blue', alpha=alpha)
            _ = s2.plot(kind=kind, ax=ax)
            _ = s2.plot(kind='kde', ax=ax2, color='green')
            _ = ax2.set_ylabel('')

        _ = ax.set_title(f'{name}')
        _ = ax.set_ylabel('')

    plt.tight_layout()

Bibliography

CGH97

E. Castillo, J.M. Gutierrez, and A.S. Hadi. Expert Systems and Probabilistic Network Models. Springer, 1997.

Cow98

R.G. Cowell. Advanced inference in Bayesian networks. In M.I. Jordan, editor, Learning in Graphical Models. A Bradford Book, 1998.

Hen88

M. Henrion. Propagating uncertainty in Bayesian networks by probabilistic logic sampling. Uncertainty in Artificial Intelligence, 1988.

HD99

C. Huang and A. Darwiche. Inference in belief networks: a procedural guide. International Journal of Approximate Reasoning, 1999.

IC02

J.S. Ide and F.G. Cozman. Random generation of Bayesian networks. Advances in Artificial Intelligence, 2002.

Kol09

D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.

Mur12

K.P. Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.

Pea88

J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

Pea00

J. Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, 2000.

Pea16

J. Pearl. Causal Inference in Statistics - A Primer. Wiley, 2016.

Pea18

J. Pearl. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.

py-bbn

Subpackages

Graph

Variable

Variable.

class pybbn.graph.variable.Variable(id, name, values)

Bases: object

A variable.

__init__(id, name, values)

Ctor.

Parameters
  • id – Numeric identifier. e.g. 0

  • name – Name. e.g. ‘a’

  • values – Array of values. e.g. [‘on’, ‘off’]

to_dict()

Gets a JSON serializable dictionary representation.

Returns

Dictionary.

Node

Nodes. There are many types: nodes, cliques, belief network nodes and separation sets.

class pybbn.graph.node.BbnNode(variable, probs)

Bases: pybbn.graph.node.Node

A BBN node.

get_weight()

Gets the weight, which is the number of values.

Returns

Weight.

to_dict()

Gets a JSON serializable dictionary representation.

Returns

Dictionary.

class pybbn.graph.node.Clique(nodes)

Bases: pybbn.graph.node.Node

A clique.

contains(id)

Checks if this clique contains the specified ID.

Parameters

id – Numeric id.

Returns

A boolean indicating if the specified id exists in this clique.

get_node_ids()

Gets the node IDs in this clique.

Returns

An array of numeric ids of the nodes in this clique.

get_sep_set(that)

Creates a separation-set from this node and the one passed in. The separation-set is composed of the intersection of the two cliques. If this node has [0, 1, 2] and the node passed in has [1, 2, 3], then the separation set will be [1, 2].

Parameters

that – Clique.

Returns

Separation-set.
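The intersection described for get_sep_set can be reproduced with ordinary sets. This is only a sketch of the rule stated above, working on lists of node ids rather than on Clique objects; `sep_set_ids` is a hypothetical helper.

```python
def sep_set_ids(this_ids, that_ids):
    """Node ids shared by two cliques, i.e. the content of their separation-set."""
    return sorted(set(this_ids) & set(that_ids))


shared = sep_set_ids([0, 1, 2], [1, 2, 3])
print(shared)  # [1, 2]
```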

get_sid()

Gets the string ID of this clique.

Returns

String ID composed of the sorted corresponding variables in each node.

get_weight()

Gets the weight of this clique; the weight is the product of the weights of the nodes in this clique.

Returns

Weight.

intersects(that)

Gets intersection information.

Parameters

that – Clique.

Returns

Tuple where the first item is a boolean indicating if there is any intersection, the second item is the IDs in this clique, the third item is the IDs in that clique, and the last item is the IDs common to both cliques.

is_marked()

Checks if this clique is marked.

Returns

A boolean indicating if the clique is marked.

is_superset(that)

Checks if this clique is a superset of that clique.

Parameters

that – Clique.

Returns

A boolean indicating if this clique is a superset of the clique passed in.

mark()

Marks this clique.

unmark()

Unmarks this clique.

class pybbn.graph.node.Node(id)

Bases: object

A node.

add_metadata(k, v)

Adds metadata.

Parameters
  • k – Key. Typically a string value.

  • v – Value. Any object.

class pybbn.graph.node.SepSet(left, right, lhs=None, rhs=None, intersection=None)

Bases: pybbn.graph.node.Clique

Separation-set.

property cost

Gets the cost.

Returns

The cost.

get_cost()

The cost is the sum of the weights of the cliques connected to this separation-set.

Returns

Cost.

get_mass()

The mass is the number of nodes in this separation-set.

Returns

Mass.

property is_empty

Checks if the cliques in this separation set have an empty intersection.

Returns

A boolean indicating if there is no intersection.

property mass

Gets the mass.

Returns

The mass.
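The three quantities defined in this section (clique weight, separation-set mass and separation-set cost) are easy to compute by hand for a small case. The sketch below follows the definitions as worded above, using plain lists of node ids and a made-up table of value counts per node; the function names and the `N_VALUES` table are hypothetical and not part of py-bbn.

```python
from math import prod

# number of values per node id (made-up example: node 2 has three values)
N_VALUES = {0: 2, 1: 2, 2: 3, 3: 2}


def clique_weight(node_ids):
    """Product of the weights (number of values) of the nodes in a clique."""
    return prod(N_VALUES[i] for i in node_ids)


def sep_set_mass(left_ids, right_ids):
    """Number of nodes in the separation-set, i.e. in the intersection."""
    return len(set(left_ids) & set(right_ids))


def sep_set_cost(left_ids, right_ids):
    """Sum of the weights of the two cliques joined by the separation-set."""
    return clique_weight(left_ids) + clique_weight(right_ids)


left, right = [0, 1, 2], [1, 2, 3]
print(clique_weight(left), clique_weight(right))
print(sep_set_mass(left, right), sep_set_cost(left, right))
```

For these two cliques each weight is 12, the mass is 2 (nodes 1 and 2 are shared) and the cost is 24.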

Edge

Edges. There are two main types: undirected and directed. However, many other types exist as well.

class pybbn.graph.edge.Edge(i, j, type)

Bases: object

Edge.

__init__(i, j, type)

Ctor.

Parameters
  • i – Node.

  • j – Node.

  • type – Edge type.

property key

Key used for map.

Returns

Key.

class pybbn.graph.edge.EdgeType(value)

Bases: enum.Enum

Edge type.

DIRECTED = 2

UNDIRECTED = 1

class pybbn.graph.edge.JtEdge(sep_set)

Bases: pybbn.graph.edge.Edge

Junction tree edge. This is basically a hyper-edge.

__init__(sep_set)

Ctor.

Parameters

sep_set – Separation set.

get_lhs_edge()

Gets a JtEdge. e.g. left – sep_set.

Returns

JtEdge.

get_rhs_edge()

Gets a JtEdge. e.g. right – sep_set.

Returns

JtEdge.

class pybbn.graph.edge.SepSetEdge(i, j)

Bases: pybbn.graph.edge.Edge

Separation set.

__init__(i, j)

Ctor.

Parameters
  • i – Node.

  • j – Node.

Graph

Basic graphs.

class pybbn.graph.graph.Graph

Bases: object

Graph.

__init__()

Ctor.

add_edge(edge)

Adds an edge.

Parameters

edge – Edge.

Returns

This graph.

add_node(node)

Adds a node.

Parameters

node – Node.

Returns

This graph.

edge_exists(id1, id2)

Checks if the specified edge id1 – id2 exists.

Parameters
  • id1 – Node id.

  • id2 – Node id.

Returns

A boolean indicating if the specified edge exists.

get_edges()

Gets all the edges.

Returns

List of edges.

get_neighbors(id)

Gets the neighbors of the specified node.

Parameters

id – Node id.

Returns

Set of neighbors of the specified node.

get_node(id)

Gets the node associated with the specified id.

Parameters

id – Node id.

Returns

Node.

get_nodes()

Gets all the nodes.

Returns

List of nodes.

remove_node(id)

Removes a node from the graph.

Parameters

id – Node id.

class pybbn.graph.graph.Ug

Bases: pybbn.graph.graph.Graph

Undirected graph.

__init__()

Ctor.

Directed Acyclic Graph

Directed acyclic graphs.

class pybbn.graph.dag.Bbn

Bases: pybbn.graph.dag.Dag

BBN.

__init__()

Ctor.

static from_csv(path)

Converts the BBN in CSV format to a BBN.

Parameters

path – Path to CSV file.

Returns

BBN.

static from_dict(d)

Creates a BBN from a dictionary (deserialized JSON).

Parameters

d – Dictionary.

Returns

BBN.

static from_json(path)

Deserializes BBN from JSON.

Parameters

path – Path.

Returns

BBN.

get_parents_ordered(id)

Gets the parent IDs of the specified node, sorted.

Parameters

id – ID of node.

Returns

List of parent IDs sorted.

static to_csv(bbn, path)

Converts the specified BBN to CSV format.

Parameters
  • bbn – BBN.

  • path – Path to file.

Returns

None.

static to_dict(bbn)

Gets a JSON serializable dictionary representation.

Parameters

bbn – BBN.

Returns

Dictionary.

static to_dne(bbn, bnet_name='network')

static to_json(bbn, path)

Serializes BBN to JSON.

Parameters
  • bbn – BBN.

  • path – Path.

Returns

None.

class pybbn.graph.dag.BbnUtil

Bases: object

BBN utility.

static get_huang_graph()

Gets the Huang reference BBN graph.

Returns

BBN.

static get_simple()

Gets a simple BBN graph.

Returns

BBN.

class pybbn.graph.dag.Dag

Bases: pybbn.graph.graph.Graph

Directed acyclic graph.

__init__()

Ctor.

edge_exists(id1, id2)

Checks if a directed edge exists between the specified ids, e.g. id1 -> id2.

Parameters
  • id1 – Node id.

  • id2 – Node id.

Returns

A boolean indicating if a directed edge id1 -> id2 exists.

get_children(node_id)

Gets the children IDs of the specified node.

Parameters

node_id – Node id.

Returns

Array of children ids.

get_i2n()

Gets a map of node identifiers to names.

Returns

Dictionary.

get_n2i()

Gets a map of node names to identifiers.

Returns

Dictionary.

get_parents(id)

Gets the parent IDs of the specified node.

Parameters

id – Node id.

Returns

Array of parent ids.

to_nx_graph()

Converts this DAG to a NX DiGraph for visualization.

Returns

A tuple, where the first item is the NX DiGraph and the second items are the node labels.

class pybbn.graph.dag.PathDetector(graph, start, stop)

Bases: object

Detects path between two nodes.

__init__(graph, start, stop)

Ctor.

Parameters
  • graph – DAG.

  • start – Start node id.

  • stop – Stop node id.

exists()

Checks if a path exists.

Returns

True if a path exists, otherwise, false.
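The check can be pictured as a graph search. Below is a minimal sketch of the idea on a plain adjacency dictionary, not on pybbn's own Dag class; the `children` map and the `path_exists` function are hypothetical names used only for illustration, and the library's actual traversal may differ.

```python
def path_exists(children, start, stop):
    """Return True if a directed path leads from start to stop (iterative DFS)."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == stop:
            return True
        if node in seen:
            continue
        seen.add(node)
        # follow outgoing arcs only
        stack.extend(children.get(node, []))
    return False


# Hypothetical DAG: 0 -> 1 -> 2 and 0 -> 3
children = {0: [1, 3], 1: [2]}
reachable = path_exists(children, 0, 2)
unreachable = path_exists(children, 3, 2)
```

In this sketch a node is treated as reachable from itself; whether PathDetector does the same is not stated here.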

Partially Directed Acyclic Graph

Partially directed acyclic graphs.

class pybbn.graph.pdag.PathDetector(graph, start, stop)

Bases: object

Detects path between two nodes.

__init__(graph, start, stop)

Ctor.

Parameters
  • graph – Pdag.

  • start – Start node id.

  • stop – Stop node id.

exists()

Checks if a path exists.

Returns

True if a path exists, otherwise, false.

class pybbn.graph.pdag.Pdag

Bases: pybbn.graph.graph.Graph

Partially directed acyclic graph.

__init__()

Ctor.

directed_edge_exists(id1, id2)

Checks if the specified edge id1 -> id2 exists.

Parameters
  • id1 – Node id.

  • id2 – Node id.

Returns

A boolean indicating if the edge exists.

edge_exists(id1, id2)

Checks if the specified edge id1 – id2 exists.

Parameters
  • id1 – Node id.

  • id2 – Node id.

Returns

A boolean indicating if the edge exists.

get_out_nodes(id)

Gets all the out nodes for the node with the specified id. Out nodes are all connected nodes that are not parents (do not have a directed arc into the specified node).

Parameters

id – Node id.

Returns

Array of out node ids.

get_parents(id)

Gets the parents of the specified node.

Parameters

id – Node id.

Returns

Array of parent ids.

Join Tree

Join trees or junction trees.

class pybbn.graph.jointree.ChangeType(value)

Bases: enum.Enum

Change type.

NONE = 1
RETRACTION = 3
UPDATE = 2

class pybbn.graph.jointree.Evidence(node, type)

Bases: object

Evidence.

__init__(node, type)

Ctor.

Parameters
  • node – BBN node.

  • type – EvidenceType.

add_value(value, likelihood)

Adds a value.

Parameters
  • value – Value.

  • likelihood – Likelihood.

Returns

This evidence.

compare(potentials)

Compares this evidence with previous ones.

Parameters

potentials – Map of potentials.

Returns

The ChangeType from the comparison.

validate()

Validates this evidence.

  • virtual evidence: each likelihood must be in the range [0, 1].

  • finding evidence: all likelihoods must be exactly 1.0 or 0.0.

  • observation evidence: exactly one likelihood is 1.0 and all others must be 0.0.
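The three rules above can be sketched as a small stand-alone check. This is only an illustration of the rules as stated: the `is_valid` function and the string type names are hypothetical, and the sketch does not reproduce whatever `validate()` does internally (for example, whether it raises or repairs values).

```python
def is_valid(evidence_type, likelihoods):
    """Check a list of likelihoods against the rule for the given evidence type."""
    if evidence_type == 'virtual':
        # each likelihood must lie in [0, 1]
        return all(0.0 <= v <= 1.0 for v in likelihoods)
    if evidence_type == 'finding':
        # every likelihood is exactly 0.0 or 1.0
        return all(v in (0.0, 1.0) for v in likelihoods)
    if evidence_type == 'observation':
        # exactly one 1.0, everything else 0.0
        return sorted(likelihoods) == [0.0] * (len(likelihoods) - 1) + [1.0]
    raise ValueError('unknown evidence type: ' + evidence_type)


checks = [
    is_valid('virtual', [0.3, 0.9]),
    is_valid('finding', [1.0, 1.0, 0.0]),
    is_valid('observation', [0.0, 1.0]),
    is_valid('observation', [1.0, 1.0]),
]
```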

class pybbn.graph.jointree.EvidenceBuilder

Bases: object

Evidence builder.

__init__()

Ctor.

build()

Builds an evidence.

Returns

Evidence.

with_evidence(val, likelihood)

Adds evidence.

Parameters
  • val – Value.

  • likelihood – Likelihood.

Returns

Builder.

with_node(node)

Adds a BBN node.

Parameters

node – BBN node.

Returns

Builder.

with_type(type)

Adds the EvidenceType.

Parameters

type – EvidenceType.

Returns

Builder.

class pybbn.graph.jointree.EvidenceType(value)

Bases: enum.Enum

Evidence type.

FINDING = 2
OBSERVATION = 3
UNOBSERVE = 4
VIRTUAL = 1

class pybbn.graph.jointree.JoinTree

Bases: pybbn.graph.graph.Ug

Join tree.

__init__()

Ctor.

add_edge(edge)

Adds a JtEdge.

Parameters

edge – JtEdge.

Returns

This join tree.

add_potential(clique, potential)

Adds a potential associated with the specified clique.

Parameters
  • clique – Clique.

  • potential – Potential.

Returns

This join tree.

find_cliques_with_node_and_parents(id)

Finds all cliques in this junction tree having the specified node and its parents.

Parameters

id – Node id.

Returns

Array of cliques.

static from_dict(d)

Converts a dictionary to a junction tree.

Parameters

d – Dictionary.

Returns

Junction tree.

get_bbn_node(id)

Gets the BBN node associated with the specified id.

Parameters

id – Node id.

Returns

BBN node or None if no such node exists.

get_bbn_node_and_parents()

Gets a map of nodes and their parents.

Returns

Map. Keys are node IDs and values are lists of nodes.

get_bbn_node_by_name(name)

Gets the BBN node associated with the specified name.

Parameters

name – Node name.

Returns

BBN node or None if no such node exists.

get_bbn_nodes()

Gets all the BBN nodes in this junction tree.

Returns

List of BBN nodes.

get_bbn_potential(node)

Gets the potential associated with the specified BBN node.

Parameters

node – BBN node.

Returns

Potential.

get_change_type(evidences)

Gets the change type associated with the specified list of evidences.

Parameters

evidences – List of evidences.

Returns

ChangeType.

get_cliques()

Gets all the cliques in this junction tree.

Returns

Array of cliques.

get_evidence(node, value)

Gets the evidence associated with the specified BBN node and value.

Parameters
  • node – BBN node.

  • value – Value.

Returns

Potential (the evidence).

get_flattened_edges()

Gets all the edges “flattened” out. Since separation-sets are really hyper-edges, this method breaks separation-sets into two edges.

Returns

Array of edges.
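As a rough illustration of the flattening, the sketch below represents each hyper-edge as a hypothetical (left clique, separation set, right clique) triple of labels and splits it into two ordinary edges. The tuple layout and the `flatten` name are assumptions made for this example, not pybbn's data model.

```python
def flatten(hyper_edges):
    """Split each (left, sep_set, right) hyper-edge into two ordinary edges."""
    edges = []
    for left, sep_set, right in hyper_edges:
        edges.append((left, sep_set))   # left clique to separation set
        edges.append((sep_set, right))  # separation set to right clique
    return edges


# Hypothetical junction tree ABD -[AD]- ADE -[DE]- DEF
hyper_edges = [('ABD', 'AD', 'ADE'), ('ADE', 'DE', 'DEF')]
flat = flatten(hyper_edges)
```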

get_posteriors()

Gets the posterior for all nodes.

Returns

Map. Keys are node names; values are map of node values to posterior probabilities.

get_sep_sets()

Gets all the separation sets in this junction tree.

Returns

Array of separation sets.

get_unobserved_evidence(node)

Gets the unobserved evidences associated with the specified node.

Parameters

node – BBN node.

Returns

Evidence.

set_listener(listener)

Sets the listener.

Parameters

listener – JoinTreeListener.

set_observation(evidence)

Sets a single observation.

Parameters

evidence – Evidence.

Returns

This join tree.

static to_dict(jt, bbn)

Converts a junction tree to a serializable dictionary.

Parameters
  • jt – Junction tree.

  • bbn – BBN.

Returns

Dictionary.

unmark_cliques()

Unmarks the cliques.

unobserve(nodes)

Unobserves a list of nodes.

Parameters

nodes – List of nodes.

Returns

This join tree.

unobserve_all()

Unobserves all BBN nodes.

Returns

This join tree.

update_bbn_cpts(cpts)

Updates the CPTs of the BBN nodes.

Parameters

cpts – Dictionary of CPTs. Keys are ids of BBN node and values are new CPTs.

Returns

None

update_evidences(evidences)

Updates this join tree with the list of specified evidence.

Parameters

evidences – List of evidences.

Returns

This join tree.

class pybbn.graph.jointree.JoinTreeListener

Bases: object

Interface-like class used for listening to a join tree.

evidence_retracted(join_tree)

Evidence is retracted.

Parameters

join_tree – Join tree.

evidence_updated(join_tree)

Evidence is updated.

Parameters

join_tree – Join tree.

class pybbn.graph.jointree.PathDetector(graph, start, stop)

Bases: object

Detects path between two nodes.

__init__(graph, start, stop)

Ctor.

Parameters
  • graph – Join tree.

  • start – Start node id.

  • stop – Stop node id.

exists()

Checks if a path exists.

Returns

True if a path exists, otherwise, false.

Factory

Factories.

class pybbn.graph.factory.Factory

Bases: object

Factory to convert other API BBNs into py-bbn.

static from_data(structure, df)

Creates a BBN.

Parameters
  • structure – A dictionary where keys are names of children and values are list of parent names.

  • df – A dataframe.

Returns

BBN.

static from_libpgm_discrete_dictionary(d)

Converts a libpgm discrete network as specified by a dictionary into a py-bbn one. Look at https://pythonhosted.org/libpgm/unittestdict.html.

Parameters

d – A dictionary representing a libpgm discrete network.

Returns

py-bbn BBN.

static from_libpgm_discrete_json(j)

Converts a libpgm discrete network as specified by a JSON string into a py-bbn one. Look at https://pythonhosted.org/libpgm/unittestdict.html.

Parameters

j – String representing JSON.

Returns

py-bbn BBN.

static from_libpgm_discrete_object(bn)

Converts a libpgm discrete network object into a py-bbn one.

Parameters

bn – libpgm discrete BBN.

Returns

py-bbn BBN.

Potential

Potentials.

class pybbn.graph.potential.Potential

Bases: object

Potential.

__init__()

Ctor.

add_entry(entry)

Adds a PotentialEntry.

Parameters

entry – PotentialEntry.

Returns

This potential.

get_matching_entries(entry)

Gets all potential entries matching the specified entry.

Parameters

entry – PotentialEntry.

Returns

Array of matching potential entries.
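A minimal sketch of the matching idea, assuming an entry matches when it agrees with every node id and value in the query entry (as described for `matches` below). Entries are modeled here as plain dictionaries from node id to value; that representation and both function names are illustrative only.

```python
def matches(entry, query):
    """True if every (node id, value) pair in query also appears in entry."""
    return all(entry.get(k) == v for k, v in query.items())


def get_matching_entries(entries, query):
    """Keep the entries that agree with the query on all of its keys."""
    return [e for e in entries if matches(e, query)]


# Hypothetical potential over two binary nodes with ids 0 and 1
entries = [
    {0: 'on', 1: 'on'},
    {0: 'on', 1: 'off'},
    {0: 'off', 1: 'on'},
    {0: 'off', 1: 'off'},
]
found = get_matching_entries(entries, {0: 'on'})
```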

static to_dict(potentials)

Converts potential to dictionary for easy validation.

Parameters

potentials – Potential.

Returns

Dictionary representation. Keys are entries and values are probabilities.

class pybbn.graph.potential.PotentialEntry

Bases: object

Potential entry.

__init__()

Ctor.

add(k, v)

Adds a node id and its value.

Parameters
  • k – Node id.

  • v – Value.

Returns

This potential entry.

duplicate()

Duplicates this entry.

Returns

PotentialEntry.

get_entry_keys()

Gets entry keys sorted.

Returns

List of tuples. The first item of each tuple is the id of the variable and the second item is the value of the variable.

get_kv()

Gets key-value pair that may be used for storage in dictionary.

Returns

Key-value pair.

matches(that)

Checks if this potential entry matches the specified one. A match occurs when all the keys and their associated values in the potential entry passed in match those in this one.

Parameters

that – PotentialEntry.

Returns

class pybbn.graph.potential.PotentialUtil

Bases: object

Potential util.

static divide(numerator, denominator)

Divides two potentials.

Parameters
  • numerator – Potential.

  • denominator – Potential.

Returns

Potential.

static get_cartesian_product(lists)

Gets the cartesian product of a list of lists of values. For example, if the list is

  • [ ['on', 'off'], ['on', 'off'] ]

then the result will be a list of the following

  • [ 'on', 'on' ]

  • [ 'on', 'off' ]

  • [ 'off', 'on' ]

  • [ 'off', 'off' ]

Parameters

lists – List of list of values.

Returns

Cartesian product of values.
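The same result can be reproduced with the standard library, which may help when checking expected orderings. This sketch uses `itertools.product` and is not a claim about how `get_cartesian_product` is implemented.

```python
from itertools import product

lists = [['on', 'off'], ['on', 'off']]

# product(*lists) varies the last list fastest, matching the order shown above
combos = [list(values) for values in product(*lists)]
```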

static get_potential(node, parents)

Gets the potential associated with the specified node and its parents.

Parameters
  • node – BBN node.

  • parents – Parents of the BBN node (that themselves are also BBN nodes).

Returns

Potential.

static get_potential_from_nodes(nodes)

Gets a potential from a list of BBN nodes.

Parameters

nodes – Array of BBN nodes.

Returns

Potential.

static is_zero(d)

Checks if the specified value is 0.0.

Parameters

d – Value.

Returns

A boolean indicating if the value is zero.

static marginalize_for(join_tree, clique, nodes)

Marginalizes the specified clique’s potential over the specified nodes.

Parameters
  • join_tree – Join tree.

  • clique – Clique.

  • nodes – List of BBN nodes.

Returns

Potential.
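Marginalization can be pictured as summing out every variable that is not kept. The sketch below assumes the result is a potential over the listed nodes and models a potential as a dictionary keyed by tuples of (variable, value) pairs; the representation and the `marginalize` name are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def marginalize(potential, keep):
    """Sum a potential down to the variables named in keep."""
    result = defaultdict(float)
    for assignment, value in potential.items():
        # drop the variables being summed out from the key
        key = tuple((var, val) for var, val in assignment if var in keep)
        result[key] += value
    return dict(result)


# Joint of a and b built from P(a) = [0.5, 0.5] and P(b | a) = [0.5, 0.5, 0.4, 0.6]
joint = {
    (('a', 'on'), ('b', 'on')): 0.25,
    (('a', 'on'), ('b', 'off')): 0.25,
    (('a', 'off'), ('b', 'on')): 0.2,
    (('a', 'off'), ('b', 'off')): 0.3,
}
marginal_b = marginalize(joint, {'b'})
```

Here `marginal_b` holds 0.45 for b = on and 0.55 for b = off.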

static merge(node, parents)

Merges the nodes into one array.

Parameters
  • node – BBN node.

  • parents – BBN parent nodes.

Returns

Array of BBN nodes.

static multiply(bigger, smaller)

Multiplies two potentials. Order matters.

Parameters
  • bigger – Bigger potential.

  • smaller – Smaller potential.

static normalize(potential)

Normalizes the potential (makes sure its values sum to 1.0).

Parameters

potential – Potential.

Returns

Potential.
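In plain terms, normalization divides every value by the total. A minimal sketch on a dictionary of values follows; the dictionary form and the function name are illustrative, and the sketch assumes the total is nonzero.

```python
def normalize(potential):
    """Scale the values so that they sum to 1.0 (assumes a nonzero total)."""
    total = sum(potential.values())
    return {key: value / total for key, value in potential.items()}


unnormalized = {'on': 0.2, 'off': 0.6}
normalized = normalize(unnormalized)
```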

static pass_single_message(join_tree, x, s, y)

Single message pass from x – s – y (from x to s to y).

Parameters
  • join_tree – Join tree.

  • x – Clique.

  • s – Separation-set.

  • y – Clique.
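A single pass can be sketched as the projection and absorption steps described by Huang and Darwiche [HD99]: the separation set takes the marginal of x, and y is rescaled by the ratio of the new to the old separation-set values. The sketch below is an assumption about the general scheme, not pybbn's code; potentials are modeled as dictionaries keyed by frozensets of (variable, value) pairs, and all names are hypothetical.

```python
from collections import defaultdict


def project(potential, variables):
    """Marginalize a potential onto the given variables."""
    out = defaultdict(float)
    for assignment, value in potential.items():
        key = frozenset((var, val) for var, val in assignment if var in variables)
        out[key] += value
    return dict(out)


def pass_message(x, s, y, s_vars):
    """Pass a message x -> s -> y; return the updated (s, y) potentials."""
    new_s = project(x, s_vars)  # projection step
    new_y = {}
    for assignment, value in y.items():
        key = frozenset(pair for pair in assignment if pair[0] in s_vars)
        old = s[key]
        ratio = 0.0 if old == 0.0 else new_s[key] / old  # treat 0/0 as 0
        new_y[assignment] = value * ratio  # absorption step
    return new_s, new_y


def entry(**kv):
    return frozenset(kv.items())


# Hypothetical cliques: x over {a, b}, y over {b, c}, separation set over {b}
x = {entry(a='on', b='on'): 0.25, entry(a='on', b='off'): 0.25,
     entry(a='off', b='on'): 0.2, entry(a='off', b='off'): 0.3}
s = {entry(b='on'): 1.0, entry(b='off'): 1.0}
y = {entry(b='on', c='on'): 0.7, entry(b='on', c='off'): 0.3,
     entry(b='off', c='on'): 0.2, entry(b='off', c='off'): 0.8}

new_s, new_y = pass_message(x, s, y, {'b'})
```

After the pass, `new_y` sums to 1.0 because y held a conditional table and the message carried the marginal of b.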

Utilities

Utilities to make life easier.

class pybbn.graph.util.IdUtil

Bases: object

ID util.

static hash_string(s)

Hashes the string.

Parameters

s – String.

Returns

Hash value.

Junction Tree Algorithm

Inference Control

Used in controlling exact inference.

class pybbn.pptc.inferencecontroller.InferenceController

Bases: pybbn.graph.jointree.JoinTreeListener

Inference controller.

static apply(bbn)

Sets up the specified BBN for probability propagation in tree clusters (PPTC).

Parameters

bbn – BBN graph.

Returns

Join tree.

static apply_from_serde(join_tree)

Applies propagation to a deserialized join tree.

Parameters

join_tree – Join tree.

Returns

Join tree (the same one passed in).

evidence_retracted(join_tree)

Evidence is retracted.

Parameters

join_tree – Join tree.

evidence_updated(join_tree)

Evidence is updated.

Parameters

join_tree – Join tree.

static reapply(join_tree, cpts)

Reapply propagation to join tree with new CPTs. The join tree structure is kept but the BBN node CPTs are updated. A new instance/copy of the join tree will be returned.

Parameters
  • join_tree – Join tree.

  • cpts – Dictionary of new CPTs. Keys are id’s of nodes and values are new CPTs.

Returns

Join tree.

Potential Initialization

Used to initialize potentials.

class pybbn.pptc.potentialinitializer.PotentialInitializer

Bases: object

Potential initializer.

static init(bbn)

Initializes the BBN potentials.

Parameters

bbn – BBN graph.

static reinit(jt)

Reinitialize potentials of BBN nodes in join tree.

Parameters

jt – Join tree.

Returns

None.

Moralization

Moralization of a directed acyclic graph.

class pybbn.pptc.moralizer.Moralizer

Bases: object

Graph moralizer for a DAG.

static moralize(dag)

Moralizes a DAG.

Parameters

dag – DAG.

Returns

Moralized (undirected) graph.
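Moralization connects every pair of parents that share a child and then drops edge directions. A minimal sketch on a child-to-parents dictionary follows; the input format and the `moralize` function are hypothetical and independent of the Moralizer class.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edges of the moral graph for a child -> parents map."""
    edges = set()
    for child, pas in parents.items():
        for p in pas:
            edges.add(frozenset((p, child)))  # drop the direction of p -> child
        for p, q in combinations(pas, 2):
            edges.add(frozenset((p, q)))      # "marry" parents sharing a child
    return edges


# Hypothetical DAG: a -> b, a -> c, b -> d, c -> e, d -> f, e -> f
parents = {'b': ['a'], 'c': ['a'], 'd': ['b'], 'e': ['c'], 'f': ['d', 'e']}
moral = moralize(parents)
```

The only edge added by moralization in this example is d – e, since d and e are both parents of f.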

Triangulation

Triangulates a moralized graph.

class pybbn.pptc.triangulator.NodeClique(node, neighbors, weight, edges)

Bases: object

Node clique.

__init__(node, neighbors, weight, edges)

Ctor.

Parameters
  • node – BBN node.

  • neighbors – BBN nodes (neighbors).

  • weight – Weight.

  • edges – Edges.

get_bbn_nodes()

Gets all the BBN nodes in this node clique.

Returns

Array of BBN nodes.

class pybbn.pptc.triangulator.Triangulator

Bases: object

Triangulator. Triangulates an undirected moralized graph and produces cliques in the process.

static duplicate(g)

Duplicates an undirected graph.

Parameters

g – Undirected graph.

Returns

Undirected graph.

static generate_cliques(m)

Generates a list of node cliques.

Parameters

m – Graph.

Returns

List of NodeCliques.

static get_edges_to_add(n, m)

Gets edges to add.

Parameters
  • n – BBN node.

  • m – Graph.

Returns

Array of edges.

static get_weight(n, m)

Gets the weight of a BBN node. The weight of a node is the product of its weight with all its neighbors' weights.

Parameters
  • n – BBN node.

  • m – Graph.

Returns

Weight.

static is_subset(cliques, clique)

Checks if the specified clique is a subset of the specified list of cliques.

Parameters
  • cliques – List of cliques.

  • clique – Clique.

Returns

A boolean indicating if the clique is a subset.

static select_node(m)

Selects a clique from the specified graph. Cliques are sorted by number of edges, weight, and id (asc).

Parameters

m – Graph.

Returns

Clique.

static triangulate(m)

Triangulates the specified moralized graph.

Parameters

m – Moralized undirected graph.

Returns

Array of cliques.
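The overall procedure can be sketched as greedy node elimination: pick the node that needs the fewest fill-in edges (ties broken by weight, then id), connect its neighbors, record the resulting clique unless it is contained in one already found, and remove the node. The sketch below works on a plain adjacency dictionary and assumes a node's own weight is its number of values, as in [HD99]; the names and data layout are hypothetical.

```python
from itertools import combinations
from math import prod


def triangulate(adj, n_values):
    """Eliminate nodes greedily; return the maximal cliques found along the way."""
    adj = {n: set(nb) for n, nb in adj.items()}  # work on a copy
    cliques = []

    def fill_ins(n):
        # edges that must be added so the neighbors of n form a clique
        return [(u, v) for u, v in combinations(sorted(adj[n]), 2) if v not in adj[u]]

    def weight(n):
        # product of the number of values of n and of its neighbors
        return prod(n_values[m] for m in adj[n] | {n})

    while adj:
        node = min(adj, key=lambda n: (len(fill_ins(n)), weight(n), n))
        for u, v in fill_ins(node):
            adj[u].add(v)
            adj[v].add(u)
        clique = adj[node] | {node}
        if not any(clique <= c for c in cliques):
            cliques.append(clique)
        for nb in adj.pop(node):
            adj[nb].discard(node)
    return cliques


# Hypothetical moral graph: the 4-cycle a - b - c - d - a, all nodes binary
adj = {'a': {'b', 'd'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'a', 'c'}}
cliques = triangulate(adj, {'a': 2, 'b': 2, 'c': 2, 'd': 2})
```

For the 4-cycle, eliminating a adds the chord b – d, and the two cliques {a, b, d} and {b, c, d} result.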

Transformation

Transforms the cliques found from triangulation into a junction tree.

class pybbn.pptc.transformer.Transformer

Bases: object

Transformer. Transforms a list of cliques into a join tree.

static get_sep_sets(cliques)

Gets all pair-wise separation-sets.

Parameters

cliques – Array of cliques.

Returns

Array of separation sets sorted by mass (desc), then cost (asc), then id (asc).
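The ordering can be sketched as follows, assuming the definitions from [HD99]: the mass of a candidate separation set is the number of variables shared by the two cliques, and its cost is the sum of the two clique weights, where a clique's weight is the product of its variables' value counts. The dictionary layout is hypothetical, and the id tie-breaker is left out; Python's stable sort simply keeps input order for ties.

```python
from itertools import combinations
from math import prod


def candidate_sep_sets(cliques, n_values):
    """Rank all pair-wise separation sets by mass (desc), then cost (asc)."""
    def weight(clique):
        return prod(n_values[v] for v in clique)

    candidates = []
    for x, y in combinations(cliques, 2):
        shared = x & y
        candidates.append({
            'left': x, 'right': y, 'sep_set': shared,
            'mass': len(shared),
            'cost': weight(x) + weight(y),
        })
    return sorted(candidates, key=lambda s: (-s['mass'], s['cost']))


# Hypothetical cliques; every variable is binary except f, which has 3 values
cliques = [frozenset('abd'), frozenset('ade'), frozenset('def')]
n_values = {'a': 2, 'b': 2, 'd': 2, 'e': 2, 'f': 3}
ranked = candidate_sep_sets(cliques, n_values)
```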

static transform(cliques)

Transforms the cliques into a join tree.

Parameters

cliques – List of cliques.

Returns

Join tree.

Initialization

Initializes a junction tree.

class pybbn.pptc.initializer.Initializer

Bases: object

Initializes the join tree.

static get_clique(node, join_tree)

Gets the parent clique associated with the specified BBN node.

Parameters
  • node – BBN node.

  • join_tree – Join tree.

Returns

Parent clique.

static initialize(join_tree)

Starts the initialization.

Parameters

join_tree – Join tree.

Returns

Join tree.

Propagation

Propagates evidences in a junction tree.

class pybbn.pptc.propagator.Propagator

Bases: object

Evidence propagator.

static collect_evidence(join_tree, start)

Collects evidence.

Parameters
  • join_tree – Join tree.

  • start – Start clique.

static distribute_evidence(join_tree, start)

Distributes evidence.

Parameters
  • join_tree – Join tree.

  • start – Start clique.

static propagate(join_tree)

Propagates evidence.

Parameters

join_tree – Join tree.

Returns

Join tree.

Evidence Distribution

Distributes evidences.

class pybbn.pptc.evidencedistributor.EvidenceDistributor(join_tree, start_clique)

Bases: object

Evidence distributor. Passes messages using breadth-first-search (BFS). Messages are passed from the start clique to the far remote cliques.

__init__(join_tree, start_clique)

Ctor.

Parameters
  • join_tree – Join tree.

  • start_clique – Start clique.

start()

Starts the evidence distribution.

Evidence Collection

Collects evidences.

class pybbn.pptc.evidencecollector.EvidenceCollector(join_tree, start_clique)

Bases: object

Evidence collector. Passes messages using depth-first-search (DFS). Messages are passed from the far remote cliques back to the start clique.

__init__(join_tree, start_clique)

Ctor.

Parameters
  • join_tree – Join tree.

  • start_clique – Start clique.

start()

Starts the evidence collection.
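The two traversal orders can be compared on a small tree. The sketch below only lists the (sender, receiver) pairs in the order messages would be passed, assuming collection works depth-first toward the start clique and distribution works breadth-first away from it; separation sets are omitted, and the tree and function names are hypothetical.

```python
from collections import deque


def collect_order(tree, start):
    """Messages flowing toward start, deepest cliques first (DFS)."""
    order, seen = [], {start}

    def visit(node):
        for nb in tree[node]:
            if nb not in seen:
                seen.add(nb)
                visit(nb)
                order.append((nb, node))  # send only after the subtree is done

    visit(start)
    return order


def distribute_order(tree, start):
    """Messages flowing away from start, level by level (BFS)."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nb in tree[node]:
            if nb not in seen:
                seen.add(nb)
                order.append((node, nb))
                queue.append(nb)
    return order


# Hypothetical clique tree: W - Y - X - Z, rooted at X
tree = {'X': ['Y', 'Z'], 'Y': ['X', 'W'], 'Z': ['X'], 'W': ['Y']}
inward = collect_order(tree, 'X')
outward = distribute_order(tree, 'X')
```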

Sampling

Use this module for sampling.

class pybbn.sampling.sampling.LogicSampler(bbn)

Bases: object

Logic sampling with rejection.

__init__(bbn)

Ctor.

Parameters

bbn – BBN.

get_samples(evidence={}, n_samples=100, seed=37)

Gets the samples.

Parameters
  • evidence – Evidence. Dictionary. Keys are ids and values are node values.

  • n_samples – Number of samples.

  • seed – Seed (default=37).

Returns

Samples.
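Logic sampling with rejection can be sketched in a few lines: draw each node in topological order from its conditional distribution, then discard any sample that contradicts the evidence. The two-node network, the name-keyed evidence, and the choice to keep drawing until `n_samples` samples are accepted are all assumptions made for this illustration and may differ from LogicSampler.

```python
import random

# Hypothetical network a -> b with P(a) and P(b | a)
p_a = {'on': 0.5, 'off': 0.5}
p_b_given_a = {'on': {'on': 0.5, 'off': 0.5}, 'off': {'on': 0.4, 'off': 0.6}}


def draw(dist, rng):
    """Pick a value by walking the cumulative distribution."""
    u, total = rng.random(), 0.0
    for value, p in dist.items():
        total += p
        if u < total:
            return value
    return value  # guard against floating-point round-off


def get_samples(evidence, n_samples=100, seed=37):
    """Forward-sample the network, rejecting samples that contradict the evidence."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n_samples:
        a = draw(p_a, rng)
        b = draw(p_b_given_a[a], rng)  # child is sampled after its parent
        sample = {'a': a, 'b': b}
        if all(sample[k] == v for k, v in evidence.items()):
            samples.append(sample)
    return samples


samples = get_samples({'b': 'on'}, n_samples=100)
```

Rejection becomes expensive when the evidence is unlikely, because most draws are thrown away.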

class pybbn.sampling.sampling.SortableNode(node_id, parent_ids)

Bases: object

Sortable node.

__init__(node_id, parent_ids)

Ctor.

Parameters
  • node_id – Node ID.

  • parent_ids – List of parent IDs.

class pybbn.sampling.sampling.Table(node, parents=[])

Bases: object

Table associating parent instantiations with cumulative distributions of node values.

__init__(node, parents=[])

Ctor.

Parameters
  • node – BBN node.

  • parents – List of parent BBN nodes.

get_value(prob, sample=None)

Gets the value associated with the specified probability.

Parameters
  • prob – Probability.

  • sample – Dictionary of variable-value sampled so far.

Returns

Value.

has_parents()

Checks if the node associated with this table has parents.

Returns

Boolean.

Generator

Use this package to create realistic Bayesian belief networks.

pybbn.generator.bbngenerator.convert_for_drawing(bbn)

Converts a BBN to a networkx graph for drawing.

Parameters

bbn – BBN.

Returns

Directed acyclic graph.

pybbn.generator.bbngenerator.convert_for_exact_inference(g, p)

Converts the graph and parameters to a BBN.

Parameters
  • g – Directed acyclic graph (DAG in the form of networkx).

  • p – Parameters.

Returns

BBN.

pybbn.generator.bbngenerator.generate_bbn_to_file(n, file_path, bbn_type='singly', max_iter=10, max_values=2, max_alpha=10)

Generates a BBN and saves it to a file.

Parameters
  • n – Number of nodes.

  • file_path – File path. JSON and CSV supported. Export will be determined by path extension.

  • bbn_type – Type: singly or multi.

  • max_iter – Maximum iterations.

  • max_values – Maximum values.

  • max_alpha – Maximum alpha.

Returns

None.

pybbn.generator.bbngenerator.generate_multi_bbn(n, max_iter=10, max_values=2, max_alpha=10)

Generates structure and parameters for a multi-connected BBN.

Parameters
  • n – Number of nodes.

  • max_iter – Maximum iterations.

  • max_values – Maximum values per node.

  • max_alpha – Maximum alpha per value (hyperparameters).

Returns

A tuple of structure and parameters.

pybbn.generator.bbngenerator.generate_singly_bbn(n, max_iter=10, max_values=2, max_alpha=10)

Generates structure and parameters for a singly-connected BBN.

Parameters
  • n – Number of nodes.

  • max_iter – Maximum iterations.

  • max_values – Maximum values per node.

  • max_alpha – Maximum alpha per value (hyperparameters).

Returns

A tuple of structure and parameters.

pybbn.generator.bbngenerator.to_json(g, params, pretty=False)

Serializes the graph to JSON.

Parameters
  • g – Graph.

  • params – Parameters.

  • pretty – Pretty-print serialization flag.

Returns

None.

Causality

Average Causal Effect

Use this package to compute the Average Causal Effect.

class pybbn.causality.ace.Ace(bbn)

Bases: object

Estimates average causal effect (ACE).

__init__(bbn)

Ctor.

Parameters

bbn – Bayesian belief network.

get_ace(x, y, y_val)

Computes the ACE of X on Y.

Parameters
  • x – X name.

  • y – Y name.

  • y_val – Y value.

Returns

Dictionary of ACE over X values.

Gaussian Package

Inference

Use this module to do inference in Gaussian Bayesian Belief Networks.

class pybbn.gaussian.inference.GaussianInference(H, M, E, meta={})

Bases: object

Gaussian inference.

property P

Gets the univariate parameters of each variable.

Returns

Dictionary. Keys are variable names. Values are tuples of (mean, variance).

__init__(H, M, E, meta={})

Ctor.

Parameters
  • H – Headers.

  • M – Means.

  • E – Covariance matrix.

  • meta – Dictionary storing observations.

do_inference(name, observation)

Performs inference. Simply calls the do_inferences method.

Parameters
  • name – Name of variable.

  • observation – Observation value.

Returns

GaussianInference.

do_inferences(observations)

Performs inference.

Denote the following.

  • \(z\) as the variable observed

  • \(y\) as the set of other variables

  • \(\mu\) as the vector of means
    • \(\mu_z\) as the partitioned \(\mu\) of length \(|z|\)

    • \(\mu_y\) as the partitioned \(\mu\) of length \(|y|\)

  • \(\Sigma\) as the covariance matrix
    • \(\Sigma_{yz}\) as the partitioned \(\Sigma\) of \(|y|\) rows and \(|z|\) columns

    • \(\Sigma_{zz}\) as the partitioned \(\Sigma\) of \(|z|\) rows and \(|z|\) columns

    • \(\Sigma_{yy}\) as the partitioned \(\Sigma\) of \(|y|\) rows and \(|y|\) columns

If we observe evidence \(z_e\), then the new means \(\mu_y^{*}\) and covariance matrix \(\Sigma_y^{*}\) corresponding to \(y\) are computed as follows.

  • \(\mu_y^{*} = \mu_y + \Sigma_{yz} \Sigma_{zz}^{-1} (z_e - \mu_z)\)

  • \(\Sigma_y^{*} = \Sigma_{yy} - \Sigma_{yz} \Sigma_{zz}^{-1} \Sigma_{yz}^{T}\)

Parameters

observations – List of observations. Each observation is a tuple (name, value).

Returns

GaussianInference.
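For intuition, the conditioning step can be worked through in the simplest case of one observed variable z and one unobserved variable y, where the matrices reduce to scalars and the inverse of \(\Sigma_{zz}\) becomes a division. The sketch uses the standard Gaussian conditioning identities, mean \(\mu_y + \Sigma_{yz} \Sigma_{zz}^{-1} (z_e - \mu_z)\) and variance \(\Sigma_{yy} - \Sigma_{yz} \Sigma_{zz}^{-1} \Sigma_{yz}^{T}\); the `condition` function and its parameter names are hypothetical and not part of GaussianInference.

```python
def condition(mu_y, mu_z, s_yy, s_yz, s_zz, z_e):
    """Condition a bivariate Gaussian (y, z) on the observation z = z_e."""
    gain = s_yz / s_zz                 # Sigma_yz * Sigma_zz^{-1} in the scalar case
    mean = mu_y + gain * (z_e - mu_z)  # mean of y shifts toward the evidence
    var = s_yy - gain * s_yz           # variance of y never increases
    return mean, var


# Made-up numbers: observing z two units above its mean
mean, var = condition(mu_y=1.0, mu_z=2.0, s_yy=4.0, s_yz=1.5, s_zz=3.0, z_e=4.0)
```

With these numbers the gain is 0.5, so the mean of y moves from 1.0 to 2.0 and its variance drops from 4.0 to 3.25.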

property marginals

Gets the marginals.

Returns

List of dictionaries. Each element has name, mean and variance.

sample_marginals(size=1000)

Samples data from the marginals.

Parameters

size – Number of samples.

Returns

Dictionary with keys as names and values as pandas series (sampled data).

Citation

@misc{vang_2017,
title={PyBBN},
url={https://github.com/vangj/py-bbn/},
author={Vang, Jee},
year={2017},
month={Jan}}

Author

Jee Vang, Ph.D.

  • Patreon: support is appreciated

  • GitHub: sponsorship will help us change the world for the better

Help