causal_networkx.PAG#
- class causal_networkx.PAG(incoming_graph_data=None, incoming_latent_data=None, incoming_uncertain_data=None, incoming_selection_data=None, **attr)[source]#
Partial ancestral graph (PAG).
An equivalence class of MAGs, which represents an equivalence class of causal DAGs.
- Parameters:
incoming_graph_data : input graph (optional, default: None)
Data to initialize the directed edge graph. The edges in this graph represent directed edges between observed variables, which are represented using a networkx.DiGraph, so it accepts any arguments from the networkx.DiGraph class. There must be no cycles in this graph structure.
incoming_latent_data : input graph (optional, default: None)
Data to initialize the bidirected edge graph. The edges in this graph represent bidirected edges, which are represented using a networkx.Graph, so it accepts any arguments from the networkx.Graph class.
incoming_uncertain_data : input graph (optional, default: None)
Data to initialize circle endpoints on the graph. The edges in this graph represent circle endpoints, which are represented using a networkx.DiGraph. This graph does not need to be acyclic, since circle endpoints may be present in both directions.
incoming_selection_data : input graph (optional, default: None)
Data to initialize the selection bias graph. Currently not used or implemented.
attr : keyword arguments, optional (default: no attributes)
Attributes to add to the graph as key=value pairs.
Notes
In PAGs, there is only one edge between any two nodes, but there are different types of edges. The entire PAG is represented using multiple networkx graphs, which are joined together to form an efficient representation of the PAG:
- directed edges (->, <-, indicating causal relationship): networkx.DiGraph
- bidirected edges (<->, indicating latent confounder): networkx.Graph
- circular endpoints (-o, o-, indicating uncertainty in edge type): networkx.DiGraph
- undirected edges (-, indicating selection bias): networkx.Graph. Currently not implemented or used.
Note that the circles are not “edges”, but simply endpoints since they can represent a variety of different possible endpoints, such as “arrow”, or “tail”, which then constitute a variety of different types of possible edges.
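As a rough illustration of this layout, the sketch below keeps one container per edge type and treats circles as endpoints stored per direction. It uses plain Python sets instead of networkx graphs, and the class `TinyPAG` with its methods is hypothetical, not part of causal_networkx.

```python
from collections import defaultdict


class TinyPAG:
    """Toy stand-in for a PAG: one container per edge type (hypothetical)."""

    def __init__(self):
        self.directed = set()    # (u, v) means u -> v
        self.bidirected = set()  # frozenset({u, v}) means u <-> v
        self.circles = set()     # (u, v) means a circle endpoint at v: u *-o v

    def add_directed(self, u, v):
        self.directed.add((u, v))

    def add_bidirected(self, u, v):
        self.bidirected.add(frozenset((u, v)))

    def add_circle_endpoint(self, u, v, bidirected=False):
        self.circles.add((u, v))
        if bidirected:
            # Also store the circle at u, giving u o-o v.
            self.circles.add((v, u))

    def adjacencies(self, u):
        """All nodes joined to u by any edge type."""
        adj = defaultdict(set)
        for a, b in self.directed | self.circles:
            adj[a].add(b)
            adj[b].add(a)
        for pair in self.bidirected:
            a, b = tuple(pair)
            adj[a].add(b)
            adj[b].add(a)
        return adj[u]


pag = TinyPAG()
pag.add_directed("x", "y")
pag.add_bidirected("y", "z")
pag.add_circle_endpoint("z", "w", bidirected=True)  # z o-o w
print(sorted(pag.adjacencies("z")))
```

Keeping the stores separate makes it cheap to ask type-specific questions (for example, "which nodes share a bidirected edge?") without scanning every edge.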
Compared to causal graphs, PAGs differ in how parents and children are defined. In causal graphs, there are only two types of edges: a directed arrow or a bidirected arrow. In PAGs, a directed arrow can have either a circle or a tail on the other end, e.g. ‘x’ o-> ‘y’. This introduces “possible” parents/children, denoted with a circle endpoint on the other end of the arrow, and definite parents/children, denoted with only an arrow. See possible_parents, possible_children and their counterparts parents, children.
Since PAGs only allow “one edge” between any two nodes, adding edges and removing edges have different semantics. See more in add_edge, remove_edge.
- Attributes:
bidirected_edges : Bidirected edges.
c_components : Generate confounded components of the graph.
circle_endpoints : Return all circle edges.
edges : Directed edges.
name : Name as a string identifier of the graph.
nodes : Return the nodes within the DAG.
Methods
add_bidirected_edge(u_of_edge, v_of_edge, **attr): Override adding bidirected edge with check on the PAG.
add_bidirected_edges_from(ebunch, **attr): Override adding bidirected edges with check on the PAG.
add_chain(node_chain): Add a causal chain.
add_circle_endpoint(u_of_edge, v_of_edge[, ...]): Add a circle endpoint between u and v (will add an edge if no previous edge).
add_circle_endpoints_from(ebunch_to_add): Add all the edges in ebunch_to_add.
add_edge(u_of_edge, v_of_edge, **attr): Override adding edge with check on the PAG.
add_edges_from(ebunch, **attr): Override adding multiple edges with check on the PAG.
add_node(node_for_adding, **attr): Add node to causal graph.
add_nodes_from(nodes_for_adding, **attr): Add nodes to causal graph.
adjacencies(u): Get all adjacent nodes to u.
all_edges(): Get dictionary of all the edges by edge type.
ancestors(source): Ancestors of 'source' node with directed path.
children(n): Return the definite children of node 'n' in a PAG.
clear(): Remove all nodes and edges in graphs.
clear_edges(): Remove all edges from causal graph without removing nodes.
compute_full_graph([to_networkx]): Compute the full graph from a PAG.
copy(): Return a copy of the causal graph.
degree(n): Compute the degree of the DiGraph.
descendants(source): Descendants of 'source' node with directed path.
do(nodes): Apply a do-intervention on nodes to causal graph.
draw(): Draw the graph.
dummy_sample(): Sample an empty dataframe with columns as the nodes.
edge_subgraph(edges): Create a causal subgraph of just certain edges.
edge_type(u, v): Return the edge type associated between u and v.
get_edge_data(u, v[, default]): Get edge data from underlying DiGraph.
has_adjacency(u, v): Check if there is any edge between u and v.
has_bidirected_edge(u, v): Check if graph has bidirected edge (u, v).
has_circle_endpoint(u, v): Check if graph has circle endpoint from u to v (u *-o v).
has_edge(u, v): Check if graph has edge (u, v).
has_node(n): Check if graph has node 'n'.
in_degree(): In degree view of DAG.
is_acyclic(): Check if graph is acyclic.
is_def_collider(node1, node2, node3): Check if <node1, node2, node3> path forms a definite collider.
is_def_noncollider(node1, node2, node3): Check if <node1, node2, node3> path forms a definite non-collider.
is_edge_visible(u, v): Check if edge (u, v) is visible, or not.
is_node_common_cause(node[, exclude_nodes]): Check if a node is a common cause within the graph.
is_unshielded_collider(a, b, c): Check if unshielded collider.
markov_blanket_of(node): Compute the markov blanket of a node.
neighbors(node): Neighbors view of DAG.
number_of_bidirected_edges([u, v]): Return number of bidirected edges in graph.
number_of_circle_endpoints([u, v]): Return number of circle endpoints in graph.
number_of_edges(): Return number of edges in graph.
number_of_nodes(): Return number of nodes in graph.
order(): Return the order of the DiGraph.
orient_circle_endpoint(u, v, endpoint): Orient circle endpoint into an arrowhead, or tail.
out_degree(): Out degree view of DAG.
parents(n): Return the definite parents of node 'n' in a PAG.
Possible c-components.
possible_children(n): Return the possible children of node 'n' in a PAG.
possible_parents(n): Return the possible parents of node 'n' in a PAG.
predecessors(u): Return predecessors of node u.
print_edge(u, v): Representation of edge between u and v as string.
relabel_nodes(mapping[, copy]): Relabel the nodes of the graph G according to a given mapping.
remove_bidirected_edge(u_of_edge, v_of_edge): Remove a bidirected edge between u and v.
remove_circle_endpoint(u, v[, bidirected]): Remove circle endpoint from graph.
remove_edge(u, v): Remove directed edge.
remove_edges_from(ebunch): Remove directed edges.
remove_node(n): Remove node in causal graphs.
remove_nodes_from(ebunch): Remove nodes from causal graph.
sample([n]): Sample from a graph.
set_nodes_as_latent_confounders(nodes): Set nodes as latent unobserved confounders.
size([weight]): Return the total number of edges possibly with weights.
soft_do(nodes[, dependencies]): Apply a soft-intervention on node to causal graph.
spouses(node): Get other parents of the children of a node (spouses).
subgraph(nodes): Create a causal subgraph of just certain nodes.
successors(u): Return successors of node u.
to_adjacency_graph(): Compute an adjacency undirected graph.
to_dot_graph([to_dagitty]): Convert to 'dot' graph representation as a string.
to_networkx(): Converts causal graphs to networkx.
to_numpy(): Convert to a matrix representation.
tomag(): Convert corresponding causal DAG to a MAG.
is_directed
is_multigraph
save
- __contains__(n)#
Return True if n is a node, False otherwise. Use: ‘n in G’.
Examples
>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> 1 in G
True
- add_bidirected_edge(u_of_edge, v_of_edge, **attr)[source]#
Override adding bidirected edge with check on the PAG.
- add_bidirected_edges_from(ebunch, **attr)[source]#
Override adding bidirected edges with check on the PAG.
- add_chain(node_chain)#
Add a causal chain.
- add_circle_endpoint(u_of_edge, v_of_edge, bidirected=False)[source]#
Add a circle endpoint between u and v (will add an edge if no previous edge).
The nodes u and v will be automatically added if they are not already in the graph.
- Parameters:
u_of_edge, v_of_edge : nodes
Nodes can be, for example, strings or numbers. Nodes must be hashable (and not None) Python objects.
bidirected : bool
Whether or not to also add an uncertain endpoint from v_of_edge to u_of_edge.
See also
add_edges_from : add a collection of edges
add_edge
- add_circle_endpoints_from(ebunch_to_add)[source]#
Add all the edges in ebunch_to_add.
If you want to add bidirected circle edges, you must pass in both (A, B) and (B, A).
- Parameters:
ebunch_to_add : container of edges
Each edge given in the container will be added to the graph. The edges must be given as 2-tuples (u, v) or 3-tuples (u, v, d) where d is a dictionary containing edge data.
See also
add_edge : add a single edge
add_circle_endpoint : convenient way to add uncertain edges
Notes
Adding the same edge twice has no effect but any edge data will be updated when each duplicate edge is added.
Examples
>>> G = nx.Graph()  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.add_edges_from([(0, 1), (1, 2)])  # using a list of edge tuples
>>> e = zip(range(0, 3), range(1, 4))
>>> G.add_edges_from(e)  # Add the path graph 0-1-2-3
Associate data to edges
>>> G.add_edges_from([(1, 2), (2, 3)], weight=3)
>>> G.add_edges_from([(3, 4), (1, 4)], label="WN2898")
- add_node(node_for_adding, **attr)#
Add node to causal graph.
- add_nodes_from(nodes_for_adding, **attr)#
Add nodes to causal graph.
- adjacencies(u)#
Get all adjacent nodes to u.
Adjacencies are defined as any type of edge to node ‘u’.
- all_edges()#
Get dictionary of all the edges by edge type.
- ancestors(source)#
Ancestors of ‘source’ node with directed path.
- property bidirected_edges#
Bidirected edges.
- property c_components: List[Set]#
Generate confounded components of the graph.
TODO: Improve runtime since this iterates over a list twice.
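A confounded component (c-component) is commonly described as a maximal set of nodes connected to each other through bidirected edges. The sketch below computes such components with a depth-first search over a plain edge list. The function `c_components` and the choice to report isolated nodes as singleton components are assumptions for illustration, not the library's implementation.

```python
def c_components(nodes, bidirected_edges):
    """Group nodes into components connected via bidirected edges."""
    neighbors = {n: set() for n in nodes}
    for u, v in bidirected_edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search that follows bidirected edges only.
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(neighbors[node] - comp)
        seen |= comp
        components.append(comp)
    return components


comps = c_components(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
print(comps)
```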
- children(n)[source]#
Return the definite children of node ‘n’ in a PAG.
Definite children are children of node ‘n’ with only a directed edge between them from ‘n’ -> ‘x’. For example, ‘n’ o-> ‘x’ does not qualify ‘x’ as a child of ‘n’.
- Parameters:
n : node
A node in the causal DAG.
- Yields:
children : Iterator
An iterator of the children of node ‘n’.
- property circle_endpoints#
Return all circle edges.
- clear()#
Remove all nodes and edges in graphs.
- clear_edges()#
Remove all edges from causal graph without removing nodes.
Clears edges in the DiGraph and the bidirected undirected graph.
- compute_full_graph(to_networkx=False)[source]#
Compute the full graph from a PAG.
Adds bidirected edges as latent confounders. Also adds circle edges as latent confounders and either:
- an unobserved mediator
- or an unobserved common effect
The unobserved common effect will always be conditioned on to preserve our notion of m-separation in PAGs.
- Parameters:
to_networkx : bool, optional
Whether to return the graph as a DAG DiGraph, by default False.
- Returns:
_full_graph : PAG | nx.DiGraph
The full directed DAG.
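The bidirected part of this expansion can be sketched as follows: each bidirected edge u <-> v is replaced by a new latent node that points to both u and v. This is only a partial illustration. It does not handle circle edges, the helper `expand_bidirected` and the `L0`, `L1`, ... naming scheme are hypothetical, and it assumes those names do not collide with existing nodes.

```python
def expand_bidirected(directed_edges, bidirected_edges):
    """Replace each u <-> v with a new latent node pointing to u and v."""
    full_edges = list(directed_edges)
    latents = []
    for idx, (u, v) in enumerate(bidirected_edges):
        latent = f"L{idx}"  # hypothetical naming scheme for latent nodes
        latents.append(latent)
        full_edges.append((latent, u))
        full_edges.append((latent, v))
    return full_edges, latents


edges, latents = expand_bidirected([("x", "y")], [("x", "y"), ("y", "z")])
print(edges)
print(latents)
```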
- copy()#
Return a copy of the causal graph.
- degree(n)#
Compute the degree of the DiGraph.
- descendants(source)#
Descendants of ‘source’ node with directed path.
- do(nodes)#
Apply a do-intervention on nodes to causal graph.
- dummy_sample()#
Sample an empty dataframe with columns as the nodes.
Used for oracle testing.
- edge_subgraph(edges)#
Create a causal subgraph of just certain edges.
- property edges#
Directed edges.
- get_edge_data(u, v, default=None)#
Get edge data from underlying DiGraph.
- has_adjacency(u, v)#
Check if there is any edge between u and v.
- has_bidirected_edge(u, v)#
Check if graph has bidirected edge (u, v).
- has_edge(u, v)#
Check if graph has edge (u, v).
- has_node(n)#
Check if graph has node ‘n’.
- in_degree()#
In degree view of DAG.
- is_acyclic()#
Check if graph is acyclic.
- is_def_collider(node1, node2, node3)[source]#
Check if <node1, node2, node3> path forms a definite collider.
I.e. node1 -> node2 <- node3.
- Parameters:
node1 : node
A node on the path to check.
node2 : node
A node on the path to check.
node3 : node
A node on the path to check.
- Returns:
is_collider : bool
Whether or not the path is a definite collider.
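A minimal sketch of this check, assuming the graph is stored as a dictionary of endpoint marks: `marks[(u, v)]` holds the mark at `v` on the edge between `u` and `v`. That storage scheme and the mark names are assumptions made for the example and do not describe the library's internals.

```python
def is_def_collider(marks, node1, node2, node3):
    """True if both edges on <node1, node2, node3> have an arrowhead at node2."""
    return (
        marks.get((node1, node2)) == "arrow"
        and marks.get((node3, node2)) == "arrow"
    )


# marks[(u, v)] is the endpoint mark at v on the edge between u and v.
marks = {
    ("a", "b"): "arrow", ("b", "a"): "tail",  # a -> b
    ("c", "b"): "arrow", ("b", "c"): "tail",  # c -> b
    ("b", "d"): "arrow", ("d", "b"): "tail",  # b -> d
}

print(is_def_collider(marks, "a", "b", "c"))  # a -> b <- c
print(is_def_collider(marks, "a", "b", "d"))  # a -> b -> d
```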
- is_def_noncollider(node1, node2, node3)[source]#
Check if <node1, node2, node3> path forms a definite non-collider.
I.e. node1 - node2 -> node3, or node1 <- node2 - node3
- Parameters:
node1 : node
A node on the path to check.
node2 : node
A node on the path to check.
node3 : node
A node on the path to check.
- Returns:
is_noncollider : bool
Whether or not the path is a definite non-collider.
- is_node_common_cause(node, exclude_nodes=None)#
Check if a node is a common cause within the graph.
- is_unshielded_collider(a, b, c)#
Check if unshielded collider.
- markov_blanket_of(node)#
Compute the markov blanket of a node.
When computing the Markov blanket for an ADMG, we can use the definition presented in [1], where S is a subset of variables in the graph and a subset S' is called a Markov blanket if it satisfies the condition:
\[X \perp S \mid S'\]
- Parameters:
node : node
The node to compute Markov blanket for.
- Returns:
markov_blanket : set
A set of parents, children and spouses of the node.
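For the purely directed case, the returned set (parents, children and spouses) can be sketched with set comprehensions over an edge list. The function below is a hypothetical illustration that ignores bidirected edges, so it should not be read as the library's ADMG-aware computation.

```python
def markov_blanket(directed_edges, node):
    """Parents, children and spouses of `node` in a directed edge list."""
    parents = {u for u, v in directed_edges if v == node}
    children = {v for u, v in directed_edges if u == node}
    # Spouses are the other parents of the node's children.
    spouses = {u for u, v in directed_edges if v in children and u != node}
    return parents | children | spouses


edges = [("a", "x"), ("x", "c"), ("s", "c"), ("c", "d")]
blanket = markov_blanket(edges, "x")
print(sorted(blanket))
```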
References
- property name#
Name as a string identifier of the graph.
This graph attribute appears in the attribute dict G.graph keyed by the string “name”, as well as an attribute (technically a property) G.name. This is entirely user controlled.
- neighbors(node)#
Neighbors view of DAG.
- property nodes#
Return the nodes within the DAG.
Ignores the c-component graph nodes.
- number_of_bidirected_edges(u=None, v=None)#
Return number of bidirected edges in graph.
- number_of_edges()#
Return number of edges in graph.
- number_of_nodes()#
Return number of nodes in graph.
- order()#
Return the order of the DiGraph.
- orient_circle_endpoint(u, v, endpoint)[source]#
Orient circle endpoint into an arrowhead, or tail.
- Parameters:
u : node
The parent node.
v : node
The node that ‘u’ points to in the graph.
endpoint : str
An edge type as specified in EndPoint (‘arrow’, ‘tail’).
- Raises:
If ‘endpoint’ is not in the EndPoint enumeration.
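The validation and orientation step might look roughly like the sketch below. The `EndPoint` enum here is a hypothetical stand-in for the library's enumeration, the `marks` dictionary (mark at `v` stored under the key `(u, v)`) is an assumed storage scheme, and the exception types are assumptions since the page does not name them.

```python
from enum import Enum


class EndPoint(Enum):
    """Hypothetical stand-in for the library's EndPoint enumeration."""

    arrow = "arrow"
    tail = "tail"
    circle = "circle"


def orient_circle_endpoint(marks, u, v, endpoint):
    """Replace the circle at v on edge (u, v) with an arrowhead or a tail."""
    allowed = {EndPoint.arrow.value, EndPoint.tail.value}
    if endpoint not in allowed:
        # The exception type is an assumption for this sketch.
        raise ValueError(f"endpoint must be one of {sorted(allowed)}")
    if marks.get((u, v)) != EndPoint.circle.value:
        raise RuntimeError(f"no circle endpoint at {v!r} on edge ({u!r}, {v!r})")
    marks[(u, v)] = endpoint


marks = {("x", "y"): "circle", ("y", "x"): "circle"}  # x o-o y
orient_circle_endpoint(marks, "x", "y", "arrow")      # now x o-> y
print(marks)
```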
- out_degree()#
Out degree view of DAG.
- parents(n)[source]#
Return the definite parents of node ‘n’ in a PAG.
Definite parents are parents of node ‘n’ with only a directed edge between them from ‘n’ <- ‘x’. For example, ‘n’ <-o ‘x’ does not qualify ‘x’ as a parent of ‘n’.
- Parameters:
n : node
A node in the causal DAG.
- Yields:
parents : Iterator
An iterator of the definite parents of node ‘n’.
- possible_children(n)[source]#
Return the possible children of node ‘n’ in a PAG.
Possible children of ‘n’ are nodes with an edge like ‘n’ o-> ‘x’. Nodes with ‘n’ o-o ‘x’ are not considered possible children.
- Parameters:
n : node
A node in the causal DAG.
- Returns:
children : Iterator
An iterator of the possible children of node ‘n’.
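The distinction between definite and possible children can be sketched with a dictionary of endpoint marks, where `marks[(u, v)]` is the mark at `v` on the edge between `u` and `v`. Both functions and the storage scheme are hypothetical and only mirror the rules stated in this section: ‘n’ -> ‘x’ gives a definite child, ‘n’ o-> ‘x’ gives a possible child, and ‘n’ o-o ‘x’ gives neither.

```python
def children(marks, n):
    """Definite children: n -> x (tail at n, arrowhead at x)."""
    return {
        x for (u, x), mark in marks.items()
        if u == n and mark == "arrow" and marks.get((x, n)) == "tail"
    }


def possible_children(marks, n):
    """Possible children: n o-> x (circle at n, arrowhead at x)."""
    return {
        x for (u, x), mark in marks.items()
        if u == n and mark == "arrow" and marks.get((x, n)) == "circle"
    }


marks = {
    ("n", "a"): "arrow", ("a", "n"): "tail",     # n -> a
    ("n", "b"): "arrow", ("b", "n"): "circle",   # n o-> b
    ("n", "c"): "circle", ("c", "n"): "circle",  # n o-o c
}

print(children(marks, "n"))
print(possible_children(marks, "n"))
```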
- possible_parents(n)[source]#
Return the possible parents of node ‘n’ in a PAG.
Possible parents of ‘n’ are nodes with an edge like ‘n’ <-o ‘x’. Nodes with ‘n’ o-o ‘x’ are not considered possible parents.
- Parameters:
n : node
A node in the PAG.
- Returns:
parents : Iterator
An iterator of the possible parents of node ‘n’.
- predecessors(u)#
Return predecessors of node u.
A predecessor is a node with a directed edge to ‘u’, that is, ‘v’ -> ‘u’. A node connected to ‘u’ only by a bidirected edge does not qualify as a predecessor.
- print_edge(u, v)[source]#
Representation of edge between u and v as string.
- Parameters:
u : node
Node in graph.
v : node
Node in graph.
- Returns:
return_str : str
The type of edge between the two nodes.
- relabel_nodes(mapping, copy=True)#
Relabel the nodes of the graph G according to a given mapping.
- Parameters:
mapping : dict
A dictionary with the old labels as keys and new labels as values. A partial mapping is allowed. Mapping 2 nodes to a single node is allowed. Any non-node keys in the mapping are ignored.
copy : bool (optional, default=True)
If True return a copy, or if False relabel the nodes in place.
- Returns:
G : instance of causal DAG
A copy (if copy is True) of the relabeled graph.
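The partial-mapping behaviour can be illustrated on a bare edge list: labels present in the mapping are replaced, all others are kept, and keys that are not nodes have no effect. The helper `relabel_edges` is hypothetical and always returns a new list, which corresponds to the copy=True case.

```python
def relabel_edges(edges, mapping):
    """Return a new edge list with node labels replaced where mapped."""
    return [(mapping.get(u, u), mapping.get(v, v)) for u, v in edges]


directed = [("a", "b"), ("b", "c")]
# "zzz" is not a node, so that entry of the mapping is simply unused.
relabeled = relabel_edges(directed, {"a": "x", "zzz": "ignored"})
print(relabeled)
print(directed)  # the original list is left untouched
```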
- remove_bidirected_edge(u_of_edge, v_of_edge, remove_isolate=True)#
Remove a bidirected edge between u and v.
The nodes u and v will be automatically added if they are not already in the graph.
- Parameters:
u_of_edge, v_of_edge : nodes
Nodes can be, for example, strings or numbers. Nodes must be hashable (and not None) Python objects.
remove_isolate : bool
Whether or not to remove isolated nodes after the removal of the bidirected edge. Default is True.
See also
networkx.MultiDiGraph.add_edges_from : add a collection of edges
networkx.MultiDiGraph.add_edge : add an edge
Notes
…
- remove_circle_endpoint(u, v, bidirected=False)[source]#
Remove circle endpoint from graph.
Removes the endpoint u *-o v from the graph and orients it as u *- v.
- Parameters:
u : node
The start node.
v : node
The ending node.
bidirected : bool, optional
Whether to also remove the endpoint from v to u, by default False.
- remove_edge(u, v)#
Remove directed edge.
- remove_edges_from(ebunch)#
Remove directed edges.
- remove_node(n)#
Remove node in causal graphs.
- remove_nodes_from(ebunch)#
Remove nodes from causal graph.
- sample(n=1000)#
Sample from a graph.
- set_nodes_as_latent_confounders(nodes)#
Set nodes as latent unobserved confounders.
Note that this only works if the original node is a common cause of some variables in the graph.
- size(weight=None)#
Return the total number of edges possibly with weights.
- soft_do(nodes, dependencies='original')#
Apply a soft-intervention on node to causal graph.
- Parameters:
nodes : nodes
A node within the graph.
dependencies : list of nodes | str, optional
What dependencies are now relevant for the node, by default ‘original’, which keeps all original directed edges (this still removes the bidirected edges). If a list of nodes, then it will add directed edges from those nodes to the node.
- Returns:
causal_graph : ADMG
The mutilated graph.
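The two modes of `dependencies` can be sketched on plain edge lists. This is a hedged illustration only: the function below is hypothetical, and it assumes that passing a list of nodes replaces the node's existing incoming directed edges, which the description above does not state explicitly.

```python
def soft_do(directed, bidirected, node, dependencies="original"):
    """Return (directed, bidirected) edge lists after a soft intervention."""
    # Bidirected edges touching the node are removed in both modes.
    new_bidirected = [edge for edge in bidirected if node not in edge]
    if dependencies == "original":
        new_directed = list(directed)
    else:
        # Assumption: existing incoming edges are replaced by the new parents.
        new_directed = [(u, v) for u, v in directed if v != node]
        new_directed += [(parent, node) for parent in dependencies]
    return new_directed, new_bidirected


directed = [("a", "x"), ("x", "y")]
bidirected = [("x", "y"), ("a", "y")]

print(soft_do(directed, bidirected, "x"))
print(soft_do(directed, bidirected, "x", dependencies=["b"]))
```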
- subgraph(nodes)#
Create a causal subgraph of just certain nodes.
- successors(u)#
Return successors of node u.
A successor is a node with a directed edge from ‘u’, that is, ‘u’ -> ‘v’. A node connected to ‘u’ only by a bidirected edge does not qualify as a successor.
- to_adjacency_graph()#
Compute an adjacency undirected graph.
Two nodes are considered adjacent if there exists any type of edge between the two nodes.
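In other words, the adjacency graph is the union of all edge types with direction and endpoint marks discarded. A minimal sketch under that reading, using a hypothetical helper that returns a set of unordered pairs rather than a networkx graph:

```python
def to_adjacency(directed, bidirected, circle_endpoints):
    """Collapse every edge type into a set of unordered node pairs."""
    adjacency = set()
    for u, v in [*directed, *bidirected, *circle_endpoints]:
        # frozenset drops direction, so (c, d) and (d, c) collapse to one pair.
        adjacency.add(frozenset((u, v)))
    return adjacency


adj = to_adjacency(
    directed=[("a", "b")],
    bidirected=[("b", "c")],
    circle_endpoints=[("c", "d"), ("d", "c")],  # c o-o d
)
pairs = sorted(sorted(pair) for pair in adj)
print(pairs)
```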
- to_dot_graph(to_dagitty=False)#
Convert to ‘dot’ graph representation as a string.
The DOT language for graphviz is what is commonly used in R’s dagitty package. This is a string representation of the graph. However, this converts to a string format that is not 100% representative of DOT [1]. See Notes for more information.
- Parameters:
to_dagitty : bool
Whether to conform to the Dagitty format, where the string begins with dag { instead of strict digraph {.
- Returns:
dot_graph : str
A string representation in DOT format for the graph.
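A rough sketch of how such a string could be assembled from node and edge lists is shown below. The function `to_dot` is hypothetical and only covers directed and bidirected edges. Its spacing and ordering were chosen to match the sample output quoted in the Notes for this method, not taken from the library's source.

```python
def to_dot(nodes, directed, bidirected, to_dagitty=False):
    """Build a DOT-like string with '->' and '<->' edge statements."""
    header = "dag {" if to_dagitty else "strict digraph {"
    parts = [f"{n};" for n in nodes]
    parts += [f"{u} -> {v};" for u, v in directed]
    parts += [f"{u} <-> {v};" for u, v in bidirected]
    return " ".join([header, *parts, "}"])


dot = to_dot([0, 1], directed=[(0, 1)], bidirected=[(0, 1)])
print(dot)

dagitty = to_dot([0, 1], directed=[(0, 1)], bidirected=[], to_dagitty=True)
print(dagitty)
```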
Notes
The output of this function can be immediately plugged into the dagitty online portal for drawing a graph.
For example, suppose we have a mixed edge graph with directed and bidirected arrows (i.e. a causal DAG). Specifically, if we had 0 -> 1 with a latent confounder, we would get the following output:
strict digraph { 0; 1; 0 -> 1; 0 <-> 1; }
To represent, for example, a bidirected edge A <-> B, the DOT format would make you use A -> B [dir=both], but this is not as intuitive. A <-> B also complies with dagitty and other approaches to drawing graphs in Python/R.
References
- to_networkx()#
Converts causal graphs to networkx.
- to_numpy()#
Convert to a matrix representation.
A single 2D numpy array is returned, since a PAG only maps one edge between any two nodes.
- Returns:
numpy_graph : np.ndarray of shape (n_nodes, n_nodes)
The causal graph with values specified as a string character. For example, if A has a directed edge to B, then the array at indices for A and B has '->'.
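The idea of a single string-valued matrix can be sketched with nested lists in place of a numpy array. The helper below is hypothetical. In particular, leaving the mirrored cell of a directed edge empty and using '' for "no edge" are assumptions, since the exact encoding is not spelled out on this page.

```python
def to_string_matrix(nodes, directed, bidirected):
    """Build an n x n matrix of edge-type strings indexed by node order."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [["" for _ in nodes] for _ in nodes]
    for u, v in directed:
        matrix[index[u]][index[v]] = "->"
    for u, v in bidirected:
        # A bidirected edge is symmetric, so both cells are filled.
        matrix[index[u]][index[v]] = "<->"
        matrix[index[v]][index[u]] = "<->"
    return matrix


m = to_string_matrix(["A", "B", "C"], directed=[("A", "B")], bidirected=[("B", "C")])
for row in m:
    print(row)
```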
Notes
In R’s pcalg package, the following encodes the types of edges as an array. We will follow the same encoding for our numpy array representation.
References
- tomag()#
Convert corresponding causal DAG to a MAG.