# ananke.models

## ananke.models.binary_nested

Implements the Nested Markov parameterization for acyclic directed mixed graphs over binary variables.

The code relies on the following papers:

[ER12] Evans, R. J., & Richardson, T. S. (2012). Maximum likelihood fitting of acyclic directed mixed graphs to binary data. arXiv preprint arXiv:1203.3479.

[ER19] Evans, R. J., & Richardson, T. S. (2019). Smooth, identifiable supermodels of discrete DAG models with latent variables. Bernoulli, 25(2), 848-876.

class ananke.models.binary_nested.BinaryNestedModel(graph)[source]

Bases: object

Estimates the parameters of the nested binary model using the iterative maximum likelihood algorithm (Algorithm 1) of [ER12]. Performs constrained minimization of the negative log-likelihood with scipy.optimize.minimize.

estimate(treatment_dict=None, outcome_dict=None, check_identified=True)[source]

Estimates p(Y(a)=y) for treatment A=a, outcome Y=y using the binary nested Markov parameters.

Parameters
• treatment_dict – dict of treatment variables to values

• outcome_dict – dict of outcome variables to values

• check_identified – boolean to check that effect is identified, default True

Returns

interventional probability p(Y(a)) if identified, else return None

fit(X, q_vector=None, tol=1e-08, *args, **kwargs)[source]

Fits the binary nested model. Let N be the number of observations and M the number of variables.

Parameters
• X – Either an N x M pandas DataFrame where each row represents an observation, or a 2 ** M x (M+1) pandas DataFrame where each row gives one configuration of the variables together with its count in a count variable

• args

• kwargs

Returns

Compute M matrix as given in Section 3.1 of [ER12]

Parameters

• partition_head_dict – dictionary mapping every subset of given district to constituent heads

• district – a district of the graph

• terms – list of terms of given district

Returns

Computes P matrix as given in Section 3.1 of [ER12]

Parameters
• partition_head_dict – dictionary mapping every subset of given district to constituent heads

• q_vector_keys – list of parameter names

• terms – list of all terms for a given district

Returns

ananke.models.binary_nested.compute_all_M_and_P(G, intrinsic_dict)[source]

Computes for each district of a graph, the corresponding M and P matrices required to compute the expression

$p(q) = \prod_j M_j \exp(P_j \log q_j)$

See Section 3.1 of [ER12] for further details.
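The expression above can be illustrated for a single district with a small sketch. It is not the library's implementation: the helper names `mat_vec` and `district_probabilities` and the toy matrices are hypothetical, and the sketch assumes the convention that a parameter q is the probability that its head takes the value 0.

```python
import math

def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def district_probabilities(M, P, q):
    """Evaluate M exp(P log q) for one district."""
    log_q = [math.log(x) for x in q]
    # Each row of P selects which q parameters are multiplied into a term.
    terms = [math.exp(t) for t in mat_vec(P, log_q)]
    # Each row of M combines the terms (with signs) into one probability.
    return mat_vec(M, terms)

# Toy district: one binary variable X with a single parameter q = p(X=0).
# Terms: the empty product (equal to 1) and q itself.
P = [[0], [1]]
# p(X=0) = q and p(X=1) = 1 - q.
M = [[0, 1], [1, -1]]
probs = district_probabilities(M, P, [0.3])
```

Here `probs` holds p(X=0) and p(X=1), which sum to one. For a graph with several districts, the product over j in the expression multiplies one such factor per district.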

Parameters

• intrinsic_dict – mapping of intrinsic set to heads and tails

Returns

ananke.models.binary_nested.compute_counterfactual_binary_parameters(G, q_vector, x_dict, y_dict)[source]

Computes a counterfactual (interventional quantity) p(Y=y | do(X=x)) using nested Markov parameters and graph

Parameters

• q_vector – A dictionary of nested Markov parameters

• x_dict – A dictionary of treatment variables to treatment values

• y_dict – A dictionary of outcome variables to outcome values

Returns

Computed probability p(Y=y | do(X=x))

ananke.models.binary_nested.compute_district_bool_map(q_vector_keys, districts)[source]
Parameters
• q_vector_keys

• districts

Returns

Compute likelihood directly using Equation (1) of [ER12].

This likelihood is not recommended for use in maximization as it is more efficiently expressed using M and P computations.

Parameters
• nu_dict – a dictionary representing a single observation of variables to value

• q_vector – dictionary mapping the head, tail, tail value to parameter value

• district – district of graph

• intrinsic_dict – mapping of intrinsic set to heads and tails

Returns

Compute partition head dictionary. Maps every subset of a district to its constituent maximal recursive heads

Parameters
• intrinsic_dict – dictionary mapping intrinsic sets to (heads, tails) of that set

• district – district of graph

Returns

ananke.models.binary_nested.compute_q_indices_by_district(q_vector_keys, districts)[source]

Computes a boolean indexing array that indicates which q parameters are involved in which districts.

Parameters
• q_vector_keys

• districts

Returns

Computes list of terms (product of q parameters) and the partition head dictionary. Each term is a tuple of (frozenset(all heads), tuple(all tails), tuple(values of all tails)).

Parameters
• district – district of graph

• intrinsic_dict – dictionary mapping intrinsic sets to (heads, tails) of that set

Returns

ananke.models.binary_nested.compute_theta_bool_map(q_vector_keys, variables)[source]

Compute map from variable to boolean indexing vector of q_parameters which have heads containing that variable. In this map, the indices are not reindexed by district. The boolean vector selects parameters which have heads containing that variable.

Parameters
• q_vector_keys – list of q_vector keys

• variables – list of variables
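As a rough illustration of the boolean map described above, the sketch below assumes each key in `q_vector_keys` is a tuple of (heads, tails, tail values), as in the q_vector description elsewhere on this page. The function name `sketch_theta_bool_map` and the example keys are hypothetical.

```python
def sketch_theta_bool_map(q_vector_keys, variables):
    """Map each variable to a list of booleans, one per q parameter,
    marking the parameters whose head contains that variable."""
    return {
        v: [v in heads for heads, _tails, _values in q_vector_keys]
        for v in variables
    }

# Hypothetical parameter keys: (heads, tails, tail values).
keys = [
    (frozenset({"A"}), (), ()),
    (frozenset({"B"}), ("A",), (0,)),
    (frozenset({"B"}), ("A",), (1,)),
    (frozenset({"A", "B"}), (), ()),
]
bool_map = sketch_theta_bool_map(keys, ["A", "B"])
# bool_map["A"] is [True, False, False, True]
# bool_map["B"] is [False, True, True, True]
```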

Returns

ananke.models.binary_nested.compute_theta_reindexed_bool_map(q_vector_keys, districts)[source]

Creates a mapping from a variable to a boolean indexing vector.

This boolean vector indexes only parameters whose heads are involved in the district of that variable. It selects parameters which have heads containing that variable.

Used to construct A and b matrices for partial likelihood.

Parameters
• q_vector_keys – list of q_vector keys

• districts – list of districts

Returns

ananke.models.binary_nested.construct_A_b(variable, q, theta_reindexed_bool_map, M, P)[source]

Constructs the A and b matrices (eqn 4, Evans and Richardson 2013) that constrain the parameters of the given variable, in a district which admits matrices M and P

Parameters
• variable – name of variable

• q_vector – q_vector in OrderedDict format

• M – M matrix

• P – P matrix

Returns

Get mapping of heads to tails from a mapping of intrinsic sets

Parameters

intrinsic_dict – mapping of intrinsic sets of some graph to heads and tails

Returns

Compute all heads of intrinsic sets

Parameters

intrinsic_dict – mapping of intrinsic sets of some graph to heads and tails

Returns

ananke.models.binary_nested.initialize_q_vector(intrinsic_dict)[source]

Generates the q_vector, a dictionary mapping (heads: frozenset, tails: tuple, value of tail: tuple) to parameter. Default q parameter values are 1/2^#(head), for each head.

Parameters

intrinsic_dict – mapping of intrinsic sets of some graph to heads and tails

Return q_vector
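A minimal sketch of this initialization, assuming `intrinsic_dict` maps each intrinsic set to a (heads, tails) pair and that every tail variable is binary. The function name `sketch_initialize_q` and the example graph (a single bidirected edge A <-> B) are illustrative assumptions, not the library's code.

```python
from itertools import product

def sketch_initialize_q(intrinsic_dict):
    """Build a q_vector keyed by (heads, tails, tail values),
    with every parameter set to 1 / 2 ** (number of head variables)."""
    q = {}
    for heads, tails in intrinsic_dict.values():
        tails = tuple(sorted(tails))
        # One parameter per assignment of binary values to the tail.
        for tail_values in product((0, 1), repeat=len(tails)):
            q[(frozenset(heads), tails, tail_values)] = 1 / 2 ** len(heads)
    return q

# Hypothetical intrinsic sets for the graph A <-> B (all tails are empty).
intrinsic = {
    frozenset({"A"}): ({"A"}, set()),
    frozenset({"B"}): ({"B"}, set()),
    frozenset({"A", "B"}): ({"A", "B"}, set()),
}
q = sketch_initialize_q(intrinsic)
```

The two singleton heads start at 1/2 and the head {A, B} starts at 1/4, which corresponds to a uniform distribution over the four joint states.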

Returns the maximal heads for a set of heads. Only defined if all heads in list_H (list) are in IHT, i.e., they are all heads of intrinsic sets

ananke.models.binary_nested.permutations(n, k=2)[source]

Computes tuples of all permutations of n variables with cardinality k.

Parameters
• n – number of variables

• k – cardinality of each variable
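One way to read this description is as the enumeration of every assignment of k possible values to n variables. The sketch below uses `itertools.product` under that assumption; the name `value_assignments` is hypothetical, and the library's own ordering and return type may differ.

```python
from itertools import product

def value_assignments(n, k=2):
    """List every tuple of length n with entries in 0..k-1,
    in ascending (lexicographic) order."""
    return list(product(range(k), repeat=n))

pairs = value_assignments(2)
# pairs is [(0, 0), (0, 1), (1, 0), (1, 1)]
```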

Returns

ananke.models.binary_nested.process_data(df, count_variable=None)[source]
Parameters
• df – pandas DataFrame with variables as columns and observations as rows

• count_variable – optional name of counting variable, if data is provided as a summary table

Returns

a vector of counts, ordered in ascending order of v for p(V=v)
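The shape of the returned count vector can be sketched without pandas. This is an assumption-laden illustration: it takes plain tuples of binary values instead of a DataFrame, and the name `count_vector` is hypothetical.

```python
from collections import Counter
from itertools import product

def count_vector(rows, n_vars):
    """Count how often each binary configuration v occurs,
    listing counts in ascending order of v."""
    counts = Counter(tuple(r) for r in rows)
    # Configurations that never occur get a count of zero.
    return [counts[v] for v in product((0, 1), repeat=n_vars)]

rows = [(0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (1, 1)]
counts = count_vector(rows, 2)
# counts is [1, 2, 0, 3] for v = (0,0), (0,1), (1,0), (1,1)
```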

Partition an arbitrary (sub)set of vertices B (in V) into recursive heads

Parameters
• B – arbitrary possibly empty subset of vertices

• intrinsic_dict – map of intrinsic sets to heads and tails of that set

## ananke.models.discrete

ananke.models.discrete.compute_district_factor(graph, net, fixing_order)[source]

Compute the interventional distribution associated with a district (or equivalently, its fixing order)

Parameters
• graph (ananke.ADMG) – Graph representing the problem

• net (pgmpy.models.BayesianNetwork) – Probability distribution corresponding to the graph

• fixing_order – A fixing sequence for the implied district D

ananke.models.discrete.compute_effect_from_discrete_model(net, treatment_dict, outcome_dict)[source]

Compute the causal effect by directly performing an intervention in a Bayesian Network corresponding to the true structural equation model to obtain the counterfactual distribution, and then computing the marginal distribution of the outcome. Note that this function does not consider issues of identification, as interventions are performed in the true model (regardless of whether those interventions are identified).

Parameters
• net – A Bayesian Network representing the causal problem. Note that this object is used only as a representation of the observed data distribution.

• treatment_dict – Dictionary of treatment variables to treatment values.

• outcome_dict – Dictionary of outcome variables to outcome values.
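The intervene-then-marginalize computation can be sketched on a toy network with plain dictionaries rather than pgmpy objects. The network (C -> A, C -> Y, A -> Y), its probabilities, and the helper `p_y1_do_a` are all hypothetical.

```python
# Toy network: C -> A, C -> Y, A -> Y, all variables binary.
p_c = {0: 0.6, 1: 0.4}                      # p(C=c)
p_y1_given_a_c = {                          # p(Y=1 | A=a, C=c)
    (0, 0): 0.1, (0, 1): 0.5,
    (1, 0): 0.3, (1, 1): 0.9,
}

def p_y1_do_a(a):
    """p(Y=1 | do(A=a)): the intervention replaces p(A | C) with a point
    mass at a, so only C remains to be summed out."""
    return sum(p_c[c] * p_y1_given_a_c[(a, c)] for c in p_c)

effect = p_y1_do_a(1) - p_y1_do_a(0)
```

With these numbers p(Y=1 | do(A=1)) is 0.54 and p(Y=1 | do(A=0)) is 0.26, up to floating-point rounding.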

ananke.models.discrete.estimate_effect_from_discrete_dist(oid, net, treatment_dict, outcome_dict)[source]

Performs the ID algorithm to identify a causal effect given a discrete probability distribution representing the observed data distribution.

Parameters
• oid (OneLineID) – Ananke OneLineID object

• net – pgmpy.BayesianNetwork-like object

• treatment_dict – dictionary of treatment variables and values

• outcome_dict – dictionary of outcome variables and values

ananke.models.discrete.generate_bayesian_network(graph, cpds)[source]

Creates a Bayesian Network from Ananke graph and a list of pgmpy TabularCPDs.

Parameters
• graph – Graph

• cpds (List[TabularCPD]) – A list of conditional probability distributions consistent with ‘graph’

ananke.models.discrete.generate_random_cpds(graph, dir_conc=10, context_variable='S')[source]

Given a graph and a set of cardinalities for variables in a DAG, constructs random conditional probability distributions. Supports optional contexts and context variable to generate CPDs consistent with a context specific DAG for data fusion.

Parameters
• graph – A graph whose variables have cardinalities, and optionally

• dir_conc – The Dirichlet concentration parameter

• context_variable – Name of the context variable
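To give a sense of what the concentration parameter controls, the sketch below draws one column of a conditional probability table from a symmetric Dirichlet distribution using only the standard library. Whether the library uses a symmetric Dirichlet in exactly this way is an assumption, and `sample_dirichlet` is a hypothetical helper.

```python
import random

def sample_dirichlet(cardinality, concentration, rng):
    """Draw one probability vector from a symmetric Dirichlet by
    normalizing independent Gamma(concentration, 1) draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(cardinality)]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)
column = sample_dirichlet(3, 10, rng)  # one column p(X | parents = some value)
```

Larger concentration values pull the sampled probabilities toward the uniform vector, while values below 1 favor vectors with most of the mass on a few states.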

ananke.models.discrete.intervene(net, treatment_dict)[source]

Performs an intervention on a pgmpy.models.BayesianNetwork, by setting the conditional distribution of each intervened variable to be a point mass at the intervened value. Does not alter the structure of the parents of the network (i.e. is a non-faithful operation).

Parameters
• net (pgmpy.models.BayesianNetwork) – Bayesian Network

• treatment_dict (dict) – dictionary of variables to values
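The point-mass replacement can be sketched with conditional probability tables stored as plain dictionaries (parent values mapped to a list of state probabilities) instead of pgmpy objects. The representation and the name `sketch_intervene` are assumptions made for illustration.

```python
def sketch_intervene(cpds, treatment_dict):
    """Return new CPDs in which each intervened variable is a point mass
    at its intervened value for every parent configuration.
    The parent sets themselves are left unchanged."""
    new_cpds = {var: dict(table) for var, table in cpds.items()}
    for var, value in treatment_dict.items():
        cardinality = len(next(iter(cpds[var].values())))
        point_mass = [1.0 if s == value else 0.0 for s in range(cardinality)]
        new_cpds[var] = {parents: list(point_mass) for parents in cpds[var]}
    return new_cpds

# Hypothetical network C -> A with binary variables.
cpds = {
    "C": {(): [0.6, 0.4]},
    "A": {(0,): [0.8, 0.2], (1,): [0.3, 0.7]},
}
after = sketch_intervene(cpds, {"A": 1})
# after["A"] is {(0,): [0.0, 1.0], (1,): [0.0, 1.0]}
```

A still lists C as a parent after the intervention, even though its distribution no longer depends on C. This mirrors the note above that the structure of the network is not altered.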

## ananke.models.linear_gaussian_sem

Class for Linear Gaussian SEMs parametrized by a matrix B representing regression coefficients and a matrix Omega representing correlated errors

class ananke.models.linear_gaussian_sem.LinearGaussianSEM(graph)[source]

Bases: object

bic(X)[source]

Calculate Bayesian information criterion of the data given the model.

Parameters
• X – an N x M dimensional data matrix.

• weights – optional 1d numpy array with weights for each data point (rows with higher weights are given greater importance).

Returns

a float corresponding to the BIC.
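For reference, the standard definition of the Bayesian information criterion is k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of observations, and L the maximized likelihood. The sketch below implements that textbook formula. How the library counts parameters and which sign convention it uses are not stated here, so treat `sketch_bic` as a hypothetical helper.

```python
import math

def sketch_bic(log_likelihood, n_params, n_obs):
    """Textbook BIC: k * ln(n) - 2 * log-likelihood (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

value = sketch_bic(log_likelihood=-100.0, n_params=5, n_obs=100)
```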

draw(direction=None)[source]

Visualize the graph.

Returns

dot language representation of the graph.

fit(X, tol=1e-06, disp=None, standardize=False, max_iters=100)[source]

Fit the model to data via (weighted) maximum likelihood estimation

Parameters
• X – data, an N x M dimensional pandas DataFrame.

• weights – optional 1d numpy array with weights for each data point (rows with higher weights are given greater importance).

Returns

self.

neg_loglikelihood(X)[source]

Calculate the negative log-likelihood of the data given the model.

Parameters
• X – an N x M dimensional data matrix.

• weights – optional 1d numpy array with weights for each data point (rows with higher weights are given greater importance).

Returns

a float corresponding to the negative log-likelihood.

total_effect(A, Y)[source]

Calculate the total causal effect of a set of treatments A on a set of outcomes Y.

Parameters
• A – iterable corresponding to variable names that act as treatments.

• Y – iterable corresponding to variable names that act as outcomes.

Returns

a float corresponding to the total causal effect.
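In a linear SEM, the total effect of one variable on another equals the sum, over all directed paths between them, of the product of the edge coefficients along each path. Correlated errors (bidirected edges) do not contribute. The sketch below applies this path rule to a hypothetical coefficient dictionary; it is not the library's implementation, and `path_total_effect` is an illustrative name.

```python
def path_total_effect(coefs, source, target):
    """Sum over directed paths from source to target of the product of
    edge coefficients. coefs maps (parent, child) to a coefficient and
    must describe an acyclic graph."""
    if source == target:
        return 1.0
    return sum(
        beta * path_total_effect(coefs, child, target)
        for (parent, child), beta in coefs.items()
        if parent == source
    )

# Hypothetical model: A -> M -> Y and A -> Y.
coefs = {("A", "M"): 2.0, ("M", "Y"): 0.5, ("A", "Y"): 1.0}
effect_a_y = path_total_effect(coefs, "A", "Y")
# effect_a_y is 2.0: the direct path contributes 1.0 and A -> M -> Y contributes 2.0 * 0.5
```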

ananke.models.linear_gaussian_sem.is_positive_definite(X)[source]