**2.5 Identifiability in the presence of hidden variables**

**2.5.4 A somewhat more involved example**

Consider the graph G in Figure 2.7, which was discussed in Pearl (2014).
It is easily shown that P(Y|do(A = a)) cannot be identified by covariate
adjustment.¹⁰ That is, the non-causal path A ← U3 → C3 → Y can only be
¹⁰ The steps below can easily be followed using DAGitty, a browser-based environment for creating, editing, and analyzing causal models (Textor et al., 2011). Graph G in Figure 2.7 can be loaded from this URL: http://dagitty.net/mMdmQxs


**Figure 2.7:** A somewhat more involved graph G and three of its subgraphs required for application of the ID algorithm. Panels: G (nodes A, M, Y, C1, C2, C3, U1, U2, U3); G_D (M, Y, C1, C3); G_{V\A} (M, Y, C1, C2, C3, U1, U2); G_{S2} (A, M, C2, C3, U1, U2, U3).

blocked by C3. Adjusting for C3 also blocks the non-causal path A ← C2 ←
U1 → C3 → Y. However, since C3 is a collider, adjusting for it would open
a spurious pathway, i.e. A ← U3 → C3 ← C1 → M → Y.¹¹ This spurious
pathway can again be blocked by adjusting for C1. This leaves us with
one remaining non-causal path, i.e. A ← C2 ← U2 → M → Y, which can
only be blocked by C2. However, since C2 is also a collider, adjusting for
it opens yet another spurious pathway that passes through collider C3, which is
already adjusted for, i.e. A ← U3 → C3 ← U1 → C2 ← U2 → M → Y.
The only way to block this spurious pathway would be to adjust for M.
However, this would imply blocking a proper causal path that we are
interested in. Indeed, the adjustment criterion dictates that no element in
the adjustment set lies on a proper causal path from A to Y.
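The path-by-path argument above can be checked mechanically. The following is a minimal Python sketch, not the procedure used in the text: the edge list is an assumption read off the paths named above (Figure 2.7 itself is not reproduced here), and the helper names `simple_paths`, `is_open` and `open_noncausal_paths` are hypothetical. A direct edge A → Y, if present in G, would only add a causal path and would not change the outcome.

```python
from itertools import combinations

# Directed edges of G, reconstructed from the paths described in the text
# (assumption). U1, U2 and U3 are the hidden variables.
EDGES = (
    ("U1", "C2"), ("U1", "C3"), ("U2", "C2"), ("U2", "M"),
    ("U3", "A"), ("U3", "C3"), ("C1", "C3"), ("C1", "M"),
    ("C2", "A"), ("A", "M"), ("M", "Y"), ("C3", "Y"),
)


def descendants(node):
    """Nodes reachable from `node` along directed edges (excluding itself)."""
    found, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for parent, child in EDGES:
            if parent == cur and child not in found:
                found.add(child)
                stack.append(child)
    return found


def simple_paths(src, dst, path=None):
    """Yield every simple path from src to dst in the undirected skeleton."""
    path = path or [src]
    if path[-1] == dst:
        yield path
        return
    for x, y in EDGES:
        for nxt in ((y,) if x == path[-1] else (x,) if y == path[-1] else ()):
            if nxt not in path:
                yield from simple_paths(src, dst, path + [nxt])


def is_causal(path):
    """True if every edge on the path points from A towards Y."""
    return all((u, v) in EDGES for u, v in zip(path, path[1:]))


def is_open(path, z):
    """True if the path is d-connecting given the conditioning set z."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in EDGES and (nxt, mid) in EDGES
        if collider:
            # A collider blocks unless it or one of its descendants is in z.
            if mid not in z and not (descendants(mid) & z):
                return False
        elif mid in z:
            # A conditioned non-collider blocks the path.
            return False
    return True


def open_noncausal_paths(z):
    z = set(z)
    return [p for p in simple_paths("A", "Y")
            if not is_causal(p) and is_open(p, z)]


# Every subset of {C1, C2, C3} leaves at least one non-causal path open.
always_open = all(
    open_noncausal_paths(z)
    for k in range(4)
    for z in combinations(("C1", "C2", "C3"), k)
)
```

Under the assumed edge list, `always_open` evaluates to `True`, in line with the argument above, whereas adding the mediator M to the full set {C1, C2, C3} leaves no open non-causal path but also blocks the causal path A → M → Y.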


Nevertheless, non-parametric identification of P(Y|do(A = a)) can be
obtained using Tian's ID algorithm (Figure 2.5).¹² Intuitively, this may
be appreciated by the fact that progress can be made by relying on exclusion
restrictions encoded in the graph in Figure 2.7. Indeed, in the simple
front-door example in section 2.5.1, progress was enabled by the exclusion
restrictions that A does not affect Y other than through M and that C does
not affect M other than through A. Below, we illustrate that reliance on
similar exclusion restrictions aids in obtaining identification
of P(Y|do(A = a)) in the graph in Figure 2.7.

The districts in the original graph G are S1 = {C1}, S2 = {C2, C3, A, M}
and S3 = {Y}. Their corresponding c-factors can be obtained by applying
Lemma 2.5.1. Because S1 and S3 are singletons, their c-factors have a unique
expression, which can easily be obtained by Lemma 2.5.1 as Q[S1] = P(C1)
and Q[S3] = P(Y|A, M, C3), respectively. The second district, S2, on the
other hand, consists of multiple nodes, which, moreover, admit
multiple possible topological orders. That is, within S2, there is no order
restriction with respect to C3. In other words, whereas C2 strictly precedes
A, which, in turn, strictly precedes M, i.e. C2 < A < M, the location of C3
within the topological ordering is unconstrained: it may precede or succeed
any of the other nodes in S2. This observation will be exploited later on
when we need to sum Q[S2] over certain variables in order to obtain
c-factors of some of the districts in the subgraph G_D. In particular, we will
need to choose two specific topological orderings in order to make
progress. First, according to the ordering C1 < C3 < C2 < A < M < Y,
Q[S2] can be expressed as

$$P(C_3 \mid C_1)\,P(C_2 \mid C_1, C_3)\,P(A \mid C_1, C_2, C_3)\,P(M \mid A, C_1, C_2, C_3) \tag{2.18}$$

by Lemma 2.5.1, whereas the ordering C1 < C2 < A < M < C3 < Y enables

¹² Identification results from both the ID algorithm and the IDC algorithm, discussed in the next section, can be obtained using the R package causaleffect (Tikka, 2016). The added value of this package follows from the fact that applying these algorithms 'by hand' can be tedious, as illustrated in this and the next section.


us to express Q[S2] as

$$P(C_2)\,P(A \mid C_2)\,P(M \mid A, C_1, C_2)\,P(C_3 \mid A, M, C_1, C_2). \tag{2.19}$$
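To make the simplification behind (2.19) explicit, the following sketch assumes that Lemma 2.5.1 (not restated in this section) has the usual product form, in which each node of a district is conditioned on all of its predecessors in the chosen topological order. Under the ordering C1 < C3 < C2 < A < M < Y this form yields (2.18) directly, whereas under the ordering C1 < C2 < A < M < C3 < Y it would give

$$Q[S_2] = P(C_2 \mid C_1)\,P(A \mid C_1, C_2)\,P(M \mid C_1, C_2, A)\,P(C_3 \mid C_1, C_2, A, M),$$

which coincides with (2.19) provided that

$$C_2 \perp C_1 \quad\text{and}\quad A \perp C_1 \mid C_2 .$$

Both independences appear to follow from d-separation in G: every path leaving C1 immediately meets a collider at C3, M or Y, and neither these colliders nor their descendants are conditioned on.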

In the subgraph G_{V\A} (Figure 2.7), C2 is no longer an ancestor of Y, such
that D = {C1, C3, M, Y}. The resulting subgraph G_D (Figure 2.7) has four
districts, i.e. D1 = {C1}, D2 = {C3}, D3 = {M} ⊂ S2 and D4 = {Y}, such
that P(Y|do(A = a)) can be expressed as

$$
\begin{aligned}
\sum_{c_1, c_3, m} Q[\{C_1\}]\,Q[\{C_3\}]\,Q[\{M\}]\,Q[\{Y\}]
= \sum_{c_1, c_3, m} & P(C_1 = c_1)\,P(C_3 = c_3 \mid \mathrm{do}(C_1 = c_1))\,P(M = m \mid \mathrm{do}(A = a, C_1 = c_1)) \\
& \times P(Y \mid \mathrm{do}(A = a, M = m, C_3 = c_3)).
\end{aligned}
\tag{2.20}
$$

Since D1 = S1 and D4 = S3, their corresponding c-factors will also be identical, i.e. Q[D1] = Q[S1] = P(C1) and Q[D4] = Q[S3] = P(Y|A, M, C3). Obtaining the c-factors of D2 and D3, both of which are subsets of S2, will, however, be somewhat more involved, as this involves application of the Identify algorithm (Figure 2.6). Since C3 only has itself as an ancestor in the subgraph G_{S2} (Figure 2.7), obtaining Q[D2] is relatively simple, as further instructions are then indicated by the first bullet in Figure 2.6. That is, Identify(D2, S2, Q[S2]) yields Q[D2] = ∑_{c2,a,m} Q[S2], which, by expression (2.18), reduces to

$$
\begin{aligned}
\sum_{c_2, a, m} & P(C_3 \mid C_1)\,P(C_2 = c_2 \mid C_1, C_3)\,P(A = a \mid C_1, C_2 = c_2, C_3) \\
& \times P(M = m \mid A = a, C_1, C_2 = c_2, C_3) = P(C_3 \mid C_1).
\end{aligned}
\tag{2.21}
$$

Note that the careful choice of letting C3 precede the other nodes in S2 in the topological ordering C1 < C3 < C2 < A < M < Y indeed leads to an expression for Q[S2] which can easily be summed over the other variables in S2.
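The district and ancestor sets used so far can be recomputed from the graph. The sketch below is illustrative only: the directed and bidirected edge lists are assumptions reconstructed from the text (each hidden variable U1, U2, U3 is replaced by a bidirected edge between its two children), and the helper names `districts` and `ancestors` are hypothetical.

```python
# Observed nodes and edges of G (assumption: reconstructed from the text).
V = {"C1", "C2", "C3", "A", "M", "Y"}
DIRECTED = [("C1", "C3"), ("C1", "M"), ("C2", "A"),
            ("A", "M"), ("M", "Y"), ("C3", "Y")]
# Bidirected edges induced by the hidden variables U1, U2 and U3.
BIDIRECTED = [("C2", "C3"), ("C2", "M"), ("A", "C3")]


def districts(nodes):
    """Districts (c-components) of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    remaining, result = set(nodes), set()
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            cur = stack.pop()
            for x, y in BIDIRECTED:
                for nb in ((y,) if x == cur else (x,) if y == cur else ()):
                    if nb in nodes and nb not in comp:
                        comp.add(nb)
                        stack.append(nb)
        remaining -= comp
        result.add(frozenset(comp))
    return result


def ancestors(target, nodes):
    """Ancestors of `target` (including itself) in the induced subgraph."""
    nodes = set(nodes)
    found, stack = {target}, [target]
    while stack:
        cur = stack.pop()
        for parent, child in DIRECTED:
            if child == cur and parent in nodes and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


S = districts(V)                 # districts S1, S2, S3 of G
D = ancestors("Y", V - {"A"})    # ancestors of Y in G_{V\A}
D_districts = districts(D)       # districts D1, ..., D4 of G_D
S2 = {"C2", "C3", "A", "M"}
anc_C3 = ancestors("C3", S2)     # ancestors of C3 in G_{S2}
```

With these assumed edges, `S` contains {C1}, {C2, C3, A, M} and {Y}, `D` equals {C1, C3, M, Y}, `D_districts` consists of four singletons, and `anc_C3` is {C3}, which is the case handled by the first bullet of the Identify algorithm.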

Obtaining Q[D3], on the other hand, is quite tedious, because it involves
recursive applications of the Identify algorithm. To see why, note that the
set of ancestors of M in G_{S2} corresponds to {C2, A, M}, which, in turn, is a
proper subset of S2. Moreover, in the subgraph G_{{C2, A, M}}, M is contained in the district {C2, M}. Before we
can compute Q[{C2, M}] we first need to obtain Q[{C2, A, M}] = ∑_{c3} Q[S2].

By expression (2.19), the latter can simply be expressed as

$$
\begin{aligned}
\sum_{c_3} & P(C_2)\,P(A \mid C_2)\,P(M \mid A, C_1, C_2)\,P(C_3 = c_3 \mid A, M, C_1, C_2) \\
& = P(C_2)\,P(A \mid C_2)\,P(M \mid A, C_1, C_2).
\end{aligned}
\tag{2.22}
$$

However, obtaining Q[{C2, M}] from Q[{C2, A, M}] now requires application
of Lemma 2.5.2, which is a more complex variant of Lemma 2.5.1.

**Lemma 2.5.2.** (Tian and Pearl, 2003) Let H ⊆ V, and assume that H is partitioned into c-components H_1, ..., H_l in the subgraph G_H. Then we have

(i) Q[H] decomposes as

$$Q[H] = \prod_{i} Q[H_i].$$

(ii) Each Q[H_i] is computable from Q[H]. Let k be the number of variables in H, and let a topological order of the variables in H be V_{m1} < ... < V_{mk} in G_H. Let H^{(i)} = {V_{m1}, ..., V_{mi}} be the set of variables in H ordered before V_{mi} (including V_{mi}), i = 1, ..., k, and H^{(0)} = ∅. Then each Q[H_j], j = 1, ..., l, is given by

$$Q[H_j] = \prod_{\{i \,\mid\, V_{m_i} \in H_j\}} \frac{Q[H^{(i)}]}{Q[H^{(i-1)}]},$$

where each Q[H^{(i)}], i = 1, ..., k, is given by

$$Q[H^{(i)}] = \sum_{h \setminus h^{(i)}} Q[H].$$

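Applying this lemma with H = {C2, A, M}, for which C2 < A < M is the only admissible topological order, gives H^{(1)} = {C2}, H^{(2)} = {C2, A} and H^{(3)} = H. The district {C2, M} collects the terms i = 1 and i = 3, so that, with Q[∅] = 1, we get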

$$
\begin{aligned}
Q[\{C_2, M\}] &= \frac{Q[\{C_2, A, M\}]\,Q[\{C_2\}]}{Q[\{C_2, A\}]} \\
&= \frac{Q[\{C_2, A, M\}]\,\sum_{a, m} Q[\{C_2, A, M\}]}{\sum_{m} Q[\{C_2, A, M\}]} \\
&= \frac{P(C_2)\,P(A \mid C_2)\,P(M \mid A, C_1, C_2)\,P(C_2)}{P(C_2)\,P(A \mid C_2)} \\
&= P(C_2)\,P(M \mid A, C_1, C_2).
\end{aligned}
\tag{2.23}
$$

As a last step, we still need to obtain Q[D3] = Q[{M}] from Q[{C2, M}] by invoking Identify(M, {C2, M}, Q[{C2, M}]). Since M does not have any ancestors (except itself) in the subgraph G_{{C2, M}}, we get

$$Q[\{M\}] = \sum_{c_2} Q[\{C_2, M\}] = \sum_{c_2} P(C_2 = c_2)\,P(M \mid A, C_1, C_2 = c_2). \tag{2.24}$$
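The ratio construction of Lemma 2.5.2(ii) that leads from (2.22) to (2.24) can be traced numerically. The following NumPy sketch is only an illustration under stated assumptions: all variables are taken to be binary, the probability tables are arbitrary random numbers rather than data, C1 is held fixed at some value c1, and the array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_conditional(shape):
    """Random positive table, normalised over its last axis."""
    table = rng.random(shape) + 0.1
    return table / table.sum(axis=-1, keepdims=True)


p_c2 = random_conditional((2,))       # P(C2), indexed [c2]
p_a = random_conditional((2, 2))      # P(A | C2), indexed [c2, a]
p_m = random_conditional((2, 2, 2))   # P(M | A, C1 = c1, C2), indexed [c2, a, m]

# Q[{C2, A, M}] as in expression (2.22), indexed [c2, a, m].
q_h = p_c2[:, None, None] * p_a[:, :, None] * p_m

# Q[H^(i)] for the order C2 < A < M: sum out the variables after position i.
q_h3 = q_h                                    # H^(3) = {C2, A, M}
q_h2 = q_h.sum(axis=2, keepdims=True)         # H^(2) = {C2, A}
q_h1 = q_h.sum(axis=(1, 2), keepdims=True)    # H^(1) = {C2}
q_h0 = 1.0                                    # H^(0) is empty, Q = 1

# District {C2, M} collects the ratios for i = 1 (C2) and i = 3 (M).
q_c2_m = (q_h1 / q_h0) * (q_h3 / q_h2)        # indexed [c2, a, m]

# Identify on {C2, M}: M has no other ancestors, so sum out C2.
q_m = q_c2_m.sum(axis=0)                      # indexed [a, m]
```

Here `q_c2_m` reproduces P(C2)P(M|A, C1, C2) from (2.23) and `q_m` reproduces the sum in (2.24). Keeping the summed-out axes with `keepdims=True` lets the ratios broadcast against the full table without manual reshaping.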

It follows that, since every Q[D_i] is identifiable, P(Y|do(A = a)) is
also identifiable. Its identification result can be obtained by putting all
pieces together and substituting every Q[D_i] in expression (2.20) by its
corresponding functional of the observed data. We hence obtain:

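$$
\begin{aligned}
P(Y \mid \mathrm{do}(A = a))
&= \sum_{c_1, c_3, m} P(C_1 = c_1)\,P(C_3 = c_3 \mid C_1 = c_1) \sum_{c_2} P(C_2 = c_2) \\
&\qquad \times P(M = m \mid A = a, C_1 = c_1, C_2 = c_2) \times P(Y \mid A = a, M = m, C_3 = c_3) \\
&= \sum_{c_1, c_2, c_3, m} P(Y \mid A = a, M = m, C_3 = c_3) \times P(M = m \mid A = a, C_1 = c_1, C_2 = c_2) \\
&\qquad \times P(C_1 = c_1)\,P(C_2 = c_2)\,P(C_3 = c_3 \mid C_1 = c_1).
\end{aligned}
\tag{2.25}
$$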
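As a sanity check on (2.25), the functional can be compared with the true interventional distribution in a small simulated model. The sketch below is a hedged illustration, not part of the original derivation: it assumes binary variables, randomly drawn conditional probability tables, and the edge set reconstructed from the text, with a direct dependence of Y on A included as an assumption suggested by Q[S3] = P(Y|A, M, C3). All function names are hypothetical, and probabilities are computed by exact enumeration rather than sampling.

```python
import itertools
import random

rng = random.Random(0)
BIN = (0, 1)


def random_cpt(n_parents):
    """P(X = 1 | parents) for each parent configuration, away from 0 and 1."""
    return {pa: rng.uniform(0.1, 0.9)
            for pa in itertools.product(BIN, repeat=n_parents)}


def bern(p_one, x):
    return p_one if x == 1 else 1.0 - p_one


# Hypothetical structural model; U1, U2, U3 are hidden.
pU1, pU2, pU3, pC1 = (random_cpt(0) for _ in range(4))
pC2 = random_cpt(2)   # parents (U1, U2)
pC3 = random_cpt(3)   # parents (C1, U1, U3)
pA = random_cpt(2)    # parents (C2, U3)
pM = random_cpt(3)    # parents (A, C1, U2)
pY = random_cpt(3)    # parents (A, M, C3); the A -> Y edge is an assumption


def exogenous(u1, u2, u3, c1):
    return (bern(pU1[()], u1) * bern(pU2[()], u2)
            * bern(pU3[()], u3) * bern(pC1[()], c1))


# Observed joint distribution over (C1, C2, C3, A, M, Y), hidden U summed out.
NAMES = ("C1", "C2", "C3", "A", "M", "Y")
observed = {}
for u1, u2, u3, c1, c2, c3, a, m, y in itertools.product(BIN, repeat=9):
    p = (exogenous(u1, u2, u3, c1) * bern(pC2[(u1, u2)], c2)
         * bern(pC3[(c1, u1, u3)], c3) * bern(pA[(c2, u3)], a)
         * bern(pM[(a, c1, u2)], m) * bern(pY[(a, m, c3)], y))
    key = (c1, c2, c3, a, m, y)
    observed[key] = observed.get(key, 0.0) + p


def P(**event):
    """Observed probability of an event such as P(A=1, M=0)."""
    return sum(p for key, p in observed.items()
               if all(key[NAMES.index(k)] == v for k, v in event.items()))


def cond(target, given):
    return P(**target, **given) / P(**given)


def formula_2_25(a, y):
    """Right-hand side of (2.25), using observed quantities only."""
    return sum(
        cond({"Y": y}, {"A": a, "M": m, "C3": c3})
        * cond({"M": m}, {"A": a, "C1": c1, "C2": c2})
        * P(C1=c1) * P(C2=c2) * cond({"C3": c3}, {"C1": c1})
        for c1, c2, c3, m in itertools.product(BIN, repeat=4)
    )


def truth(a, y):
    """P(Y = y | do(A = a)) from the structural model, with A set to a."""
    return sum(
        exogenous(u1, u2, u3, c1) * bern(pC2[(u1, u2)], c2)
        * bern(pC3[(c1, u1, u3)], c3) * bern(pM[(a, c1, u2)], m)
        * bern(pY[(a, m, c3)], y)
        for u1, u2, u3, c1, c2, c3, m in itertools.product(BIN, repeat=7)
    )
```

In this simulated model, `formula_2_25(a, y)` and `truth(a, y)` agree up to floating-point error for both values of `a`, which is what identifiability of P(Y|do(A = a)) via (2.25) would lead one to expect.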