causal_networkx.discovery.PC

class causal_networkx.discovery.PC(ci_estimator, alpha=0.05, init_graph=None, fixed_edges=None, min_cond_set_size=None, max_cond_set_size=None, max_iter=1000, max_combinations=None, apply_orientations=True, **ci_estimator_kwargs)

Methods

convert_skeleton_graph(graph)

Convert skeleton graph as undirected networkx Graph to CPDAG.

fit(X)

Fit constraint-based discovery algorithm on dataset 'X'.

learn_skeleton(X[, graph, sep_set, fixed_edges])

Learns the skeleton of a causal DAG using pairwise independence testing.

orient_edges(skel_graph, sep_set)

Orient edges in a skeleton graph to estimate the causal DAG, or CPDAG.

test_edge(data, X, Y[, Z])

Test whether X and Y are conditionally independent given Z (X || Y | Z).

convert_skeleton_graph(graph)

Convert skeleton graph as undirected networkx Graph to CPDAG.

Parameters:

graph : nx.Graph

The skeleton graph to convert to the representation needed by the PC algorithm, a CPDAG.

Returns:

graph : CPDAG

The CPDAG class.

Return type:

CPDAG

fit(X)

Fit constraint-based discovery algorithm on dataset ‘X’.

Parameters:

X : Union[pd.DataFrame, Dict[Set, pd.DataFrame]]

Either a pandas dataframe with the endogenous (observed) variables as columns and samples as rows, or a dictionary of differently sampled distributions, with the distribution names as keys and the corresponding datasets (pandas dataframes) as values.

Raises:

RuntimeError

If ‘X’ is a dictionary and the datasets do not all have the same set of column names (nodes).

Notes

The constraints imposed by the algorithm are controlled through the arguments passed to the class constructor.

Return type:

None

learn_skeleton(X, graph=None, sep_set=None, fixed_edges=None)

Learns the skeleton of a causal DAG using pairwise independence testing.

Encodes the skeleton as an undirected graph, networkx.Graph. Conditioning sets are drawn only from adjacent nodes.

Parameters:

X : pd.DataFrame

The data with columns as variables and samples as rows.

graph : nx.Graph

The undirected graph containing initialized skeleton of the causal relationships.

sep_set : set

The separating set.

fixed_edges : set, optional

The set of fixed edges. Defaults to the empty set.

return_deps : bool

Whether to also return the two dictionaries of test statistics and p-values.

Returns:

skel_graph : nx.Graph

The undirected graph of the causal graph’s skeleton.

sep_set : dict of dict of set

The separating set for each pair of variables.

Raises:

ValueError

If the nodes in the initialization graph do not match the variable names in the passed-in data, X.

ValueError

If the nodes in the fixed-edge graph do not match the variable names in the passed-in data, X.

Notes

Learning the skeleton of a causal DAG uses (conditional) independence testing to determine which variables are (in)dependent. This specific algorithm exhaustively compares pairs of adjacent variables.
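
The procedure described in the note above can be illustrated with a minimal, standard-library sketch. It is not this class's implementation: `learn_skeleton_sketch`, `chain_oracle`, and the plain dictionary used as an adjacency structure are hypothetical names and choices made for illustration, and a hand-written independence oracle stands in for the `ci_estimator`. The sketch starts from a complete graph, tests each adjacent pair with conditioning sets of growing size drawn from the neighbours, and records the separating set in the "dict of dict of set" shape listed under Returns.

```python
from itertools import combinations


def learn_skeleton_sketch(nodes, is_independent, max_cond_set_size=None):
    """Illustrative PC-style skeleton search (not the library's code)."""
    # Start from a complete undirected graph stored as an adjacency dict.
    adj = {v: set(nodes) - {v} for v in nodes}
    sep_set = {u: {v: set() for v in nodes if v != u} for u in nodes}

    size = 0
    while max_cond_set_size is None or size <= max_cond_set_size:
        tested_any = False
        for x in nodes:
            for y in sorted(adj[x]):  # sorted() takes a snapshot of the neighbours
                candidates = sorted(adj[x] - {y})
                if len(candidates) < size:
                    continue  # not enough neighbours to form a set of this size
                tested_any = True
                for cond in combinations(candidates, size):
                    if is_independent(x, y, set(cond)):
                        # Remove the edge and remember what separated the pair.
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sep_set[x][y] = set(cond)
                        sep_set[y][x] = set(cond)
                        break
        if not tested_any:
            break
        size += 1
    return adj, sep_set


def chain_oracle(x, y, cond):
    """Hypothetical oracle for the chain A -> B -> C: only A and C separate, given B."""
    return {x, y} == {"A", "C"} and "B" in cond


adj, sep_set = learn_skeleton_sketch(["A", "B", "C"], chain_oracle)
```

With the chain oracle, the edge between A and C is removed once B enters the conditioning set, leaving the skeleton A - B - C.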

Return type:

Tuple[Graph, Dict[str, Dict[str, Set[Any]]], Dict[Any, Dict[Any, float]], Dict[Any, Dict[Any, float]]]

orient_edges(skel_graph, sep_set)

Orient edges in a skeleton graph to estimate the causal DAG, or CPDAG.

Uses the separating sets found during conditional independence testing to orient edges. The orientation rules applied are known as the Meek rules [1].
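
As a rough illustration of how separating sets can drive orientation, the sketch below orients unshielded colliders and then repeatedly applies only the first Meek rule. It is a simplified assumption-laden example, not this method's implementation: `orient_edges_sketch`, the adjacency dictionary, and the pair-of-sets return value are hypothetical, the remaining Meek rules are omitted, and conflicting orientations are not handled.

```python
from itertools import combinations


def orient_edges_sketch(adj, sep_set):
    """Illustrative orientation step (colliders plus Meek rule 1 only)."""
    undirected = {frozenset((u, v)) for u in adj for v in adj[u]}
    directed = set()  # (tail, head) pairs

    def orient(tail, head):
        undirected.discard(frozenset((tail, head)))
        directed.add((tail, head))

    # Colliders: for an unshielded triple x - z - y, orient x -> z <- y
    # when z is not in the separating set of x and y.
    for z in adj:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sep_set[x][y]:
                orient(x, z)
                orient(y, z)

    # Meek rule 1: if a -> b, b - c is undirected, and a, c are
    # non-adjacent, orient b -> c. Repeat until nothing changes.
    changed = True
    while changed:
        changed = False
        for a, b in sorted(directed):
            for c in sorted(adj[b]):
                if c != a and frozenset((b, c)) in undirected and c not in adj[a]:
                    orient(b, c)
                    changed = True
    return directed, undirected


# Skeleton of the hypothetical graph A -> C <- B, C -> D.
adj = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}
sep_set = {
    "A": {"B": set(), "D": {"C"}},
    "B": {"A": set(), "D": {"C"}},
    "D": {"A": {"C"}, "B": {"C"}},
}
directed, undirected = orient_edges_sketch(adj, sep_set)
```

Here A and B are separated by the empty set, so C is oriented as a collider; the first Meek rule then orients C -> D, because orienting it the other way would create a new collider that the separating sets rule out.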

Parameters:

skel_graph : causal_networkx.CPDAG

A skeleton graph. If None, PC is initialized using a complete graph. Defaults to None.

sep_set : Dict[Dict[Set]]

The separating set between any two nodes.

Return type:

CPDAG

test_edge(data, X, Y, Z=None)

Test whether X and Y are conditionally independent given Z (X || Y | Z).

Parameters:

data : pd.DataFrame

The dataset.

X : column

A column in data.

Y : column

A column in data.

Z : list, optional

A list of columns in data, by default None.

Returns:

test_stat : float

Test statistic.

pvalue : float

The p-value.
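
To make the (test_stat, pvalue) return pair concrete, here is a minimal sketch of one common conditional independence test, the Fisher z-test on (partial) correlations. It is an assumption about what a `ci_estimator` might compute, not the estimator this class uses: `fisher_z_test` and `pearson` are hypothetical helpers, the data is a plain dictionary of columns rather than a pandas dataframe, and at most one conditioning variable is supported.

```python
import math


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def fisher_z_test(data, x, y, z=None):
    """Fisher z-test for x || y (or x || y | z with a single column z)."""
    n = len(data[x])
    r = pearson(data[x], data[y])
    n_cond = 0
    if z is not None:
        # First-order partial correlation of x and y given z.
        r_xz = pearson(data[x], data[z])
        r_yz = pearson(data[y], data[z])
        r = (r - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
        n_cond = 1
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    test_stat = math.sqrt(n - n_cond - 3) * math.atanh(r)
    pvalue = math.erfc(abs(test_stat) / math.sqrt(2))  # two-sided normal p-value
    return test_stat, pvalue


# Deterministic toy data: X and Y both depend on Z through mutually
# orthogonal "noise" patterns, so they are correlated only via Z.
z = [1, 1, -1, -1] * 25
ex = [1, -1, 1, -1] * 25
ey = [1, -1, -1, 1] * 25
data = {
    "Z": z,
    "X": [a + b for a, b in zip(z, ex)],
    "Y": [a + b for a, b in zip(z, ey)],
}

stat_marginal, p_marginal = fisher_z_test(data, "X", "Y")
stat_conditional, p_conditional = fisher_z_test(data, "X", "Y", "Z")
```

Comparing each p-value against a significance level such as the constructor's `alpha` would reject independence of X and Y marginally but not once Z is conditioned on.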

Examples using causal_networkx.discovery.PC