A somewhat more involved example

In document Flexible causal mediation analysis using natural effect models (Page 51-56)

2.5 Identifiability in the presence of hidden variables

2.5.4 A somewhat more involved example

Consider the graph G in Figure 2.7, which was discussed in Pearl (2014). It is easily shown that P(Y|do(A = a)) cannot be identified by covariate adjustment.10 That is, the non-causal path A ← U3 → C3 → Y can only be blocked by C3. Adjusting for C3 also blocks the non-causal path A ← C2 ← U1 → C3 → Y. However, since C3 is a collider, adjusting for it opens a spurious pathway, i.e. A ← U3 → C3 ← C1 → M → Y. This spurious pathway can in turn be blocked by adjusting for C1. This leaves us with one remaining non-causal path, i.e. A ← C2 ← U2 → M → Y, which can only be blocked by C2. However, since C2 is also a collider, adjusting for it opens yet another spurious pathway, which passes through the collider C3 that is already adjusted for, i.e. A ← U3 → C3 ← U1 → C2 ← U2 → M → Y. The only way to block this spurious pathway would be to adjust for M. However, this would block a proper causal path that we are interested in. Indeed, the adjustment criterion dictates that no element of the adjustment set lies on a proper causal path from A to Y.

[Figure 2.7: A somewhat more involved graph G and three of its subgraphs, G_D, G_{V\A} and G_{S2}, required for application of the ID algorithm.]

10 The steps below can easily be followed using DAGitty, a browser-based environment for creating, editing, and analyzing causal models (Textor et al., 2011). Graph G in Figure 2.7 can be loaded from this url: http://dagitty.net/mMdmQxs
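The path-by-path argument above can also be checked mechanically. The sketch below is only an illustration under stated assumptions: the edge list `EDGES` is reconstructed from the paths named in the text (Figure 2.7 itself is not reproduced here), and the helper names `ancestors`, `d_separated` and `valid_adjustment_sets` are hypothetical. It tests d-separation with the standard moralization method in the graph from which the first edge of the causal path, A → M, has been removed, and tries every subset of the observed covariates {C1, C2, C3}.

```python
from itertools import combinations

# Assumed edge list for graph G, read off the paths named in the text.
# U1, U2 and U3 are the hidden variables.
EDGES = [
    ("C1", "C3"), ("C1", "M"), ("C2", "A"), ("A", "M"), ("M", "Y"), ("C3", "Y"),
    ("U1", "C2"), ("U1", "C3"), ("U2", "C2"), ("U2", "M"), ("U3", "A"), ("U3", "C3"),
]


def ancestors(edges, nodes):
    """Return `nodes` together with all of their ancestors."""
    found = set(nodes)
    frontier = list(nodes)
    while frontier:
        child = frontier.pop()
        for parent, c in edges:
            if c == child and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def d_separated(edges, x, y, z):
    """Check whether x and y are d-separated given the set z (moralization)."""
    keep = ancestors(edges, {x, y} | set(z))
    sub = [(p, c) for p, c in edges if p in keep and c in keep]
    # Moralize: link every parent to its child and marry parents of a common child.
    links = {n: set() for n in keep}
    for p, c in sub:
        links[p].add(c)
        links[c].add(p)
    for child in keep:
        parents = [p for p, c in sub if c == child]
        for p, q in combinations(parents, 2):
            links[p].add(q)
            links[q].add(p)
    # Delete z and test whether y can still be reached from x.
    seen, frontier = {x}, [x]
    while frontier:
        node = frontier.pop()
        for nxt in links[node]:
            if nxt not in seen and nxt not in z:
                seen.add(nxt)
                frontier.append(nxt)
    return y not in seen


def valid_adjustment_sets(edges, treatment, outcome, candidates, first_causal_edges):
    """List the candidate subsets that block every non-causal path."""
    backdoor = [e for e in edges if e not in first_causal_edges]
    return [
        set(z)
        for r in range(len(candidates) + 1)
        for z in combinations(candidates, r)
        if d_separated(backdoor, treatment, outcome, set(z))
    ]


VALID = valid_adjustment_sets(EDGES, "A", "Y", ["C1", "C2", "C3"], [("A", "M")])
print(VALID)
```

Under the assumed edge list, none of the eight subsets separates A from Y, which agrees with the conclusion that covariate adjustment fails here. M is left out of the candidates because it lies on the causal path.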


Nevertheless, non-parametric identification of P(Y|do(A = a)) can be obtained using Tian's ID algorithm (Figure 2.5).12 Intuitively, progress can be made by relying on exclusion restrictions encoded in the graph in Figure 2.7. Indeed, below we illustrate that, just as the exclusion restrictions that A does not affect Y other than through M and that C does not affect M other than through A enabled us to make progress in the simple front-door example in section 2.5.1, reliance on similar exclusion restrictions aids in obtaining identification of P(Y|do(A = a)) in the graph in Figure 2.7.

The districts in the original graph G are S1 = {C1}, S2 = {C2, C3, A, M} and S3 = {Y}. Their corresponding c-factors can be obtained by applying Lemma 2.5.1. Because S1 and S3 are singletons, their c-factors have a unique expression, which can easily be obtained by Lemma 2.5.1 as Q[S1] = P(C1) and Q[S3] = P(Y|A, M, C3), respectively. The second district, S2, on the other hand, consists of multiple nodes, which, moreover, admit multiple topological orders. That is, within S2, there is no order restriction with respect to C3. In other words, whereas C2 strictly precedes A, which, in turn, strictly precedes M, i.e. C2 < A < M, the location of C3 within the topological ordering is unconstrained: it may precede or succeed any of the other nodes in S2. This observation may be exploited later on when we need to sum out Q[S2] over certain variables in order to obtain c-factors of some of the districts in the subgraph G_D. In particular, we will need to cleverly choose two specific topological orderings in order to make progress. First, according to the ordering C1 < C3 < C2 < A < M < Y, Q[S2] can be expressed as

$$P(C_3|C_1)\,P(C_2|C_1,C_3)\,P(A|C_1,C_2,C_3)\,P(M|A,C_1,C_2,C_3) \tag{2.18}$$

by Lemma 2.5.1, whereas the ordering C1 < C2 < A < M < C3 < Y enables us to express Q[S2] as

$$P(C_2)\,P(A|C_2)\,P(M|A,C_1,C_2)\,P(C_3|A,M,C_1,C_2). \tag{2.19}$$

12 Identification results from both the ID algorithm and the IDC algorithm, discussed in the next section, can be obtained using the R package causaleffect (Tikka, 2016). The added value of this package follows from the fact that applying these algorithms 'by hand' can be tedious, as illustrated in this and the next section.

In the subgraph G_{V\A} (Figure 2.7), C2 is no longer an ancestor of Y, such that D = {C1, C3, M, Y}. The resulting subgraph G_D (Figure 2.7) has four districts, i.e. D1 = {C1}, D2 = {C3}, D3 = {M} ⊂ S2 and D4 = {Y}, such that P(Y|do(A = a)) can be expressed as

$$\sum_{c_1,c_3,m} Q[\{C_1\}]\,Q[\{C_3\}]\,Q[\{M\}]\,Q[\{Y\}] = \sum_{c_1,c_3,m} P(C_1=c_1)\,P(C_3=c_3|do(C_1=c_1))\,P(M=m|do(A=a,C_1=c_1)) \times P(Y|do(A=a,M=m,C_3=c_3)). \tag{2.20}$$

Since D1 = S1 and D4 = S3, their corresponding c-factors will also be identical, i.e. Q[D1] = Q[S1] = P(C1) and Q[D4] = Q[S3] = P(Y|A, M, C3).
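The district partitions used above can be reproduced with a few lines of code. The following is a minimal sketch, assuming that each hidden variable is replaced by a bidirected link between its two observed children (U1 between C2 and C3, U2 between C2 and M, U3 between A and C3, as read off the paths named in the text). The names `BIDIRECTED` and `districts` are hypothetical and not part of any package.

```python
# Assumed bidirected links induced by the hidden variables U1, U2 and U3.
BIDIRECTED = [("C2", "C3"), ("C2", "M"), ("A", "C3")]


def districts(nodes, bidirected):
    """Partition `nodes` into districts (c-components) of the induced subgraph."""
    nodes = list(nodes)
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in bidirected:
        # A link only counts when both endpoints are in the induced subgraph.
        if u in parent and v in parent:
            parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=sorted)


DISTRICTS_G = districts(["C1", "C2", "C3", "A", "M", "Y"], BIDIRECTED)
DISTRICTS_GD = districts(["C1", "C3", "M", "Y"], BIDIRECTED)
print(DISTRICTS_G)
print(DISTRICTS_GD)
```

With these assumed links, the full node set splits into {C1}, {C2, C3, A, M} and {Y}, while the node set D splits into four singletons, because every bidirected link has at least one endpoint outside D.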

Obtaining the c-factors of D2 and D3, both of which are subsets of S2, will, however, be somewhat more involved, as this involves application of the Identify algorithm (Figure 2.6). Since C3 only has itself as an ancestor in the subgraph G_{S2} (Figure 2.7), obtaining Q[D2] is relatively simple, as further instructions are then indicated by the first bullet in Figure 2.6. That is, Identify(D2, S2, Q[S2]) yields Q[D2] = ∑_{c2,a,m} Q[S2], which, by expression (2.18), reduces to

$$\sum_{c_2,a,m} P(C_3|C_1)\,P(C_2=c_2|C_1,C_3)\,P(A=a|C_1,C_2=c_2,C_3) \times P(M=m|A=a,C_1,C_2=c_2,C_3) = P(C_3|C_1). \tag{2.21}$$

Note that the careful choice of letting C3 precede the other nodes in S2 in the topological ordering C1 < C3 < C2 < A < M < Y indeed leads to an expression for Q[S2] which can easily be summed over the other variables in S2.

Obtaining Q[D3], on the other hand, is quite tedious, because it involves recursive applications of the Identify algorithm. To see why, note that the set of ancestors of M in G_{S2} corresponds to {C2, A, M}, which, in turn, is a proper subset of S2. In the subgraph G_{C2,A,M}, M is contained in the district {C2, M}. Before we can compute Q[{C2, M}], we first need to obtain Q[{C2, A, M}] = ∑_{c3} Q[S2]. By expression (2.19), the latter can simply be expressed as

$$\sum_{c_3} P(C_2)\,P(A|C_2)\,P(M|A,C_1,C_2)\,P(C_3=c_3|A,M,C_1,C_2) = P(C_2)\,P(A|C_2)\,P(M|A,C_1,C_2). \tag{2.22}$$

However, obtaining Q[{C2, M}] from Q[{C2, A, M}] now requires application of Lemma 2.5.2, which is a more complex variant of Lemma 2.5.1.

Lemma 2.5.2. (Tian and Pearl, 2003) Let H ⊆ V, and assume that H is partitioned into c-components H1, ..., Hl in the subgraph G_H. Then we have

(i) Q[H] decomposes as

$$Q[H] = \prod_{i=1}^{l} Q[H_i].$$

(ii) Each Q[Hi] is computable from Q[H]. Let k be the number of variables in H, and let a topological order of the variables in H be Vm1 < ... < Vmk in G_H. Let H(i) = {Vm1, ..., Vmi} be the set of variables in H ordered before Vmi (including Vmi), i = 1, ..., k, and H(0) = ∅. Then each Q[Hj], j = 1, ..., l, is given by

$$Q[H_j] = \prod_{\{i \,\mid\, V_{m_i} \in H_j\}} \frac{Q[H^{(i)}]}{Q[H^{(i-1)}]},$$

where each Q[H(i)], i = 1, ..., k, is given by

$$Q[H^{(i)}] = \sum_{h \setminus h^{(i)}} Q[H].$$

Applying this lemma, we get H = {C2, A, M}, with C2 < A < M the only admissible topological order. We then get

$$Q[\{C_2,M\}] = \frac{Q[\{C_2,A,M\}]\,Q[\{C_2\}]}{Q[\{C_2,A\}]} = \frac{Q[\{C_2,A,M\}]\,\sum_{a,m} Q[\{C_2,A,M\}]}{\sum_{m} Q[\{C_2,A,M\}]} = \frac{P(C_2)\,P(A|C_2)\,P(M|A,C_1,C_2)\,P(C_2)}{P(C_2)\,P(A|C_2)} = P(C_2)\,P(M|A,C_1,C_2). \tag{2.23}$$

As a last step, we still need to obtain Q[D3] = Q[{M}] from Q[{C2, M}] by invoking Identify(M, {C2, M}, Q[{C2, M}]). Since M does not have any ancestors (except itself) in the subgraph G_{C2,M}, we get

$$Q[\{M\}] = \sum_{c_2} Q[\{C_2,M\}] = \sum_{c_2} P(C_2=c_2)\,P(M|A,C_1,C_2=c_2). \tag{2.24}$$

It follows that, since every Q[Di] is identifiable, P(Y|do(A = a)) is also identifiable. Its identification result can be obtained by putting all pieces together and substituting every Q[Di] in expression (2.20) by its corresponding functional of the observed data. We hence obtain:

$$P(Y|do(A=a)) = \sum_{c_1,c_3,m} P(C_1=c_1)\,P(C_3=c_3|C_1=c_1) \sum_{c_2} P(C_2=c_2) \times P(M=m|A=a,C_1=c_1,C_2=c_2) \times P(Y|A=a,M=m,C_3=c_3)$$

$$= \sum_{c_1,c_2,c_3,m} P(Y|A=a,M=m,C_3=c_3) \times P(M=m|A=a,C_1=c_1,C_2=c_2) \times P(C_1=c_1)\,P(C_2=c_2)\,P(C_3=c_3|C_1=c_1). \tag{2.25}$$
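As a sanity check, expression (2.25) can be compared numerically with the interventional distribution in a small simulated model. The sketch below rests on several assumptions that are not part of the original text: all variables are binary, the conditional probability tables are drawn at random, and the parent sets (for instance, that Y has only M and C3 as parents) are reconstructed from the paths named in the text. The helper names `cpt`, `bern`, `weight`, `prob`, `truth` and `identified` are hypothetical.

```python
import itertools
import random

random.seed(0)
B = (0, 1)
NAMES = ("c1", "c2", "c3", "a", "m", "y")


def cpt(n_parents):
    """Random table: parent configuration -> P(child = 1 | parents)."""
    return {pa: random.uniform(0.1, 0.9) for pa in itertools.product(B, repeat=n_parents)}


def bern(p1, value):
    """Probability of `value` for a binary variable with P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Assumed parent sets, reconstructed from the paths named in the text.
P_U1, P_U2, P_U3, P_C1 = cpt(0), cpt(0), cpt(0), cpt(0)
P_C2 = cpt(2)  # parents (u1, u2)
P_C3 = cpt(3)  # parents (c1, u1, u3)
P_A = cpt(2)   # parents (c2, u3)
P_M = cpt(3)   # parents (a, c1, u2)
P_Y = cpt(2)   # parents (m, c3)


def weight(u1, u2, u3, c1, c2, c3, a, m, y, do_a=None):
    """Probability of one full configuration; do_a replaces the factor of A."""
    w = bern(P_U1[()], u1) * bern(P_U2[()], u2) * bern(P_U3[()], u3) * bern(P_C1[()], c1)
    w *= bern(P_C2[(u1, u2)], c2) * bern(P_C3[(c1, u1, u3)], c3)
    if do_a is None:
        w *= bern(P_A[(c2, u3)], a)
    elif a != do_a:
        return 0.0
    return w * bern(P_M[(a, c1, u2)], m) * bern(P_Y[(m, c3)], y)


# Observed joint distribution, with the hidden variables summed out.
OBS = {}
for u1, u2, u3, c1, c2, c3, a, m, y in itertools.product(B, repeat=9):
    key = (c1, c2, c3, a, m, y)
    OBS[key] = OBS.get(key, 0.0) + weight(u1, u2, u3, c1, c2, c3, a, m, y)


def prob(**fixed):
    """Observed probability of the event described by `fixed`."""
    return sum(p for key, p in OBS.items()
               if all(key[NAMES.index(n)] == v for n, v in fixed.items()))


def truth(a0, y0):
    """P(Y = y0 | do(A = a0)) from the full model, hidden variables included."""
    return sum(weight(*cfg, do_a=a0)
               for cfg in itertools.product(B, repeat=9) if cfg[8] == y0)


def identified(a0, y0):
    """Expression (2.25), evaluated on the observed joint only."""
    total = 0.0
    for c1, c2, c3, m in itertools.product(B, repeat=4):
        p_y = prob(y=y0, a=a0, m=m, c3=c3) / prob(a=a0, m=m, c3=c3)
        p_m = prob(m=m, a=a0, c1=c1, c2=c2) / prob(a=a0, c1=c1, c2=c2)
        p_c3 = prob(c3=c3, c1=c1) / prob(c1=c1)
        total += p_y * p_m * prob(c1=c1) * prob(c2=c2) * p_c3
    return total


for a0, y0 in itertools.product(B, repeat=2):
    print(a0, y0, truth(a0, y0), identified(a0, y0))
```

In this assumed model the two columns should agree up to floating-point error, whereas the plain conditional probability `prob(y=y0, a=a0) / prob(a=a0)` will in general differ from both. The check only illustrates the identification result for one randomly drawn model; it is not a proof.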
