**3.7 Path-specific effects**

**3.7.3 Costs of fine-grained decompositions: assumptions**

An appealing side effect of adopting the recanting district criterion is that it makes explicit formulations of cross-world independence assumptions redundant. One should not forget, though, that non-parametric identification of generally defined path-specific effects relies on such untestable assumptions, just as for natural effects. Their precise nature can be shown to vary depending on the path-specific effect of interest that one aims to identify (Shpitser, 2013). Moreover, the number of such assumptions on which identification relies increases with the number of component effects (e.g., if one aims to obtain fine-grained decompositions into more than two component effects). This proliferation of untestable cross-world assumptions also makes it harder to completely avoid such assumptions via particular, feasible experimental designs which enable interventionist interpretations of generally defined path-specific effects, as discussed for natural effects in section 3.6.2. A detailed discussion of deterministic mediating instruments for generally defined path-specific effects is, however, beyond the scope of this chapter (although see Robins and Richardson, 2010).

$^{19}$ An additional rationale for focusing on identification by generalizations of the adjustment formula, beyond its standard form results, is that, as indicated in section 3.4.4, the adjustment criterion also serves to identify stratum-specific natural effects.

### 3

**3.7.4 Costs of fine-grained decompositions: interpretation**

Fine-grained decompositions into more than two path-specific effects along multiple causally ordered mediators also bring about additional conceptual or interpretational challenges. For instance, in chapter 5, we indicate that the aforementioned three-way decomposition into path-specific effects (in Figure 3.3A) can be parameterized by a natural effect model for the mean of recursively nested counterfactual outcomes

$$
E\{Y(a, L(a'), M(a'', L(a')))\} = g^{-1}\{\gamma_0 + \gamma_1 a + \gamma_2 a' + \gamma_3 a'' + \gamma_4 a a' + \gamma_5 a a'' + \gamma_6 a' a'' + \gamma_7 a a' a''\}.
$$

This model highlights that, in total, six possible three-way decompositions can be obtained by differently apportioning the interaction parameters $\gamma_4$ to $\gamma_7$. These decompositions involve four distinct instances of each of the path-specific effects of interest. For instance, if $g(\cdot)$ corresponds to the identity link function, the path-specific effects of the particular three-way decomposition discussed in section 3.7.1, as defined by expressions (3.10), (3.11) and (3.12), are captured by $\gamma_2 + \gamma_4 + \gamma_6 + \gamma_7$, $\gamma_1$, and $\gamma_3 + \gamma_5$, respectively. Depending on the levels at which $a$, $a'$ and $a''$ are set, other results might be obtained. For instance, the partial indirect effect with respect to $M$ can be defined as (3.13), instead of (3.12), which is captured by $\gamma_3 + \gamma_5 + \gamma_6 + \gamma_7$.

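The coefficient sums quoted above can be traced back to simple contrasts of the model mean. The derivation below is a sketch under two assumptions that are not spelled out in this section: the contrasted levels are coded 0 and 1, and $\mu(a, a', a'')$ is shorthand (introduced here only for illustration) for the mean of the nested counterfactual under the identity link. Which contrast corresponds to which numbered expression is inferred from the stated sums; the definitions themselves are given in section 3.7.1.

```latex
\begin{align*}
\mu(a, a', a'') &= \gamma_0 + \gamma_1 a + \gamma_2 a' + \gamma_3 a''
                 + \gamma_4 a a' + \gamma_5 a a'' + \gamma_6 a' a'' + \gamma_7 a a' a'' \\[4pt]
\mu(1,0,0) - \mu(0,0,0) &= \gamma_1 \\
\mu(1,0,1) - \mu(1,0,0) &= \gamma_3 + \gamma_5 \\
\mu(1,1,1) - \mu(1,0,1) &= \gamma_2 + \gamma_4 + \gamma_6 + \gamma_7 \\[4pt]
\mu(1,1,1) - \mu(0,0,0) &= \gamma_1 + \gamma_2 + \cdots + \gamma_7 \\[4pt]
\mu(1,1,1) - \mu(1,1,0) &= \gamma_3 + \gamma_5 + \gamma_6 + \gamma_7
\end{align*}
```

The first three contrasts telescope to the fourth, the total effect, so each interaction parameter is assigned to exactly one component. The last line shows how evaluating the contrast in $a''$ at $a' = 1$ rather than $a' = 0$ moves $\gamma_6$ and $\gamma_7$ into the effect through $M$.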
In general, in the presence of $k$ causally ordered mediators, maximally $k+1$ fine-grained path-specific effects are (possibly) non-parametrically identifiable,$^{20}$ each of which can be defined in $2^k$ different ways. This multitude of definitions gives rise to $(k+1)!$ different ways in which path-specific effects of interest can be combined to produce the total treatment effect (also see Daniel et al., 2015). Differences in interpretation between distinct instances of a path-specific effect may, however, be subtle and often a substantive motivation may be lacking to prefer one specific decomposition over another. The absence of interactions between path-specific effects can thus substantially facilitate interpretations of the targeted effects (Daniel et al., 2015).

$^{20}$ Specifically, in chapter 5, we target identification of the most fine-grained $(k+1)$-way decomposition characterized in terms of $k$ path-specific effects identifiable by the recanting witness criterion (Avin et al., 2005). In the absence of unmeasured confounding, we may indeed obtain $(k+1)$-way decompositions (since the recanting district criterion reduces to the recanting witness criterion in that case). In semi-Markovian NPSEMs, on the other hand, we may need to settle for coarser decompositions (see section 3.7.2).
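The counts in this passage can be checked by brute force. The following Python sketch is an illustration only, not the estimation approach of chapter 5. It assumes that every argument of the nested counterfactual takes the levels 0 and 1, and it represents a decomposition as the order in which the $k+1$ arguments are switched from 0 to 1. All function names and the coefficient values in `GAMMA` are hypothetical.

```python
from itertools import permutations


def switch_steps(order):
    """Return (argument, levels_before, levels_after) for each switch in `order`."""
    levels = [0] * len(order)
    steps = []
    for arg in order:
        before = tuple(levels)
        levels[arg] = 1
        steps.append((arg, before, tuple(levels)))
    return steps


def decompositions(k):
    """Map every switching order of the k+1 arguments to its component contrasts."""
    return {order: switch_steps(order) for order in permutations(range(k + 1))}


def count_instances(k):
    """Count decompositions and, per argument, the distinct reference levels used."""
    decomps = decompositions(k)
    instances = {arg: set() for arg in range(k + 1)}
    for steps in decomps.values():
        for arg, before, _ in steps:
            instances[arg].add(before)
    return len(decomps), {arg: len(refs) for arg, refs in instances.items()}


def mean_two_mediators(levels, gamma):
    """Identity-link model mean for k = 2; levels = (a, a', a'')."""
    a, a1, a2 = levels
    return (gamma[0] + gamma[1] * a + gamma[2] * a1 + gamma[3] * a2
            + gamma[4] * a * a1 + gamma[5] * a * a2
            + gamma[6] * a1 * a2 + gamma[7] * a * a1 * a2)


# Arbitrary integer coefficients (an assumption), chosen so that sums are easy to read.
GAMMA = [0, 1, 2, 4, 8, 16, 32, 64]

n_decomps, n_instances = count_instances(2)
print(n_decomps, n_instances)  # 6 {0: 4, 1: 4, 2: 4}

total = mean_two_mediators((1, 1, 1), GAMMA) - mean_two_mediators((0, 0, 0), GAMMA)
component_sums = [
    sum(mean_two_mediators(after, GAMMA) - mean_two_mediators(before, GAMMA)
        for _, before, after in steps)
    for steps in decompositions(2).values()
]

# Switching a first, then a'', then a' gives the sums gamma_1, gamma_3 + gamma_5
# and gamma_2 + gamma_4 + gamma_6 + gamma_7 for the respective arguments.
effects = {
    arg: mean_two_mediators(after, GAMMA) - mean_two_mediators(before, GAMMA)
    for arg, before, after in decompositions(2)[(0, 2, 1)]
}
print(effects)  # {0: 1, 2: 20, 1: 106}
```

For two mediators the sketch returns six decompositions and four distinct instances per effect, and every decomposition adds up to the same total effect. Calling `count_instances(3)` gives 24 decompositions with eight instances each, in line with $(k+1)!$ and $2^k$.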

Accordingly, in chapter 5, we indicate that flexible and parsimonious modeling and estimation approaches seem unavoidable to reduce the increasing complexity in the face of multiple causally ordered mediators to more manageable proportions. These enable assessing evidence of interaction, and expressing effects on the scale at which the evidence of interaction is weak. Alternatively, if interest lies in only one or a few path-specific effects, as is often the case in practice, one may redirect focus to coarser and therefore less ambitious decompositions that involve only those specific component effects of interest. These may often be identifiable under weaker conditions, as discussed in section 3.7.2, and are moreover easier to interpret.

**Flexible mediation analysis with a single mediator**

This chapter is based on the following paper: Steen, J., Loeys, T., Moerkerke, B., Vansteelandt, S. (2016). Medflex: An R Package for Flexible Mediation Analysis Using Natural Effect Models. Journal of Statistical Software, in press.

Mediation analysis is routinely adopted by researchers from a wide range of applied disciplines as a statistical tool to disentangle the causal pathways by which an exposure or treatment affects an outcome. The counterfactual framework provides a language for clearly defining path-specific effects of interest and has fostered a principled extension of mediation analysis beyond the context of linear models. This chapter describes medflex, an R package that implements some recent developments in mediation analysis embedded within the counterfactual framework. The medflex package offers a set of ready-made functions for fitting natural effect models, a novel class of causal models which directly parameterize the path-specific effects of interest, thereby adding flexibility to existing software packages for mediation analysis, in particular with respect to hypothesis testing and parsimony. In this chapter, we give a comprehensive overview of the functionalities of the medflex package.

### 4

**4.1 Introduction**

Empirical studies often aim at gaining insight into the underlying mechanisms by which an exposure or treatment affects an outcome of interest. Mediation analysis, as popularized in psychology and the social sciences by Judd and Kenny (1981) and Baron and Kenny (1986), has been widely adopted as a statistical tool to shed light on these mechanisms, by enabling the decomposition of total causal effects into an indirect effect through a hypothesized intermediate variable or mediator and the remaining direct effect. Although its initial formulations were restricted to the context of linear regression models, several attempts have been made to extend the application of traditional estimators for indirect effects (i.e. product-of-coefficients and difference-in-coefficients estimators) beyond linear settings (e.g. MacKinnon and Dwyer, 1993; MacKinnon et al., 2007; Hayes and Preacher, 2010; Iacobucci, 2012). However, these extensions lack formal justification and yield effect estimates that are often difficult to interpret (e.g. Pearl, 2012).
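As a reminder of what the two traditional estimators compute, the Python sketch below fits the three linear regressions of the classical approach on simulated data and compares the product-of-coefficients and difference-in-coefficients estimates. Everything in it is an assumption made for illustration: the data-generating values, the symbols `alpha1`, `beta`, `tau` and `tau_prime`, and the small `ols` helper, which solves the normal equations directly so that the example needs only the standard library.

```python
import random


def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented system [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[r][p] for r in range(p)]


# Simulated linear mediation data (arbitrary, hypothetical parameter values).
random.seed(1)
n = 500
a = [random.gauss(0, 1) for _ in range(n)]
m = [0.5 * ai + random.gauss(0, 1) for ai in a]
y = [0.3 * ai + 0.7 * mi + random.gauss(0, 1) for ai, mi in zip(a, m)]

alpha1 = ols([a], m)[1]               # exposure -> mediator
tau = ols([a], y)[1]                  # total effect of the exposure
tau_prime, beta = ols([a, m], y)[1:]  # direct effect, mediator -> outcome

product = alpha1 * beta               # product-of-coefficients estimate
difference = tau - tau_prime          # difference-in-coefficients estimate
print(abs(product - difference) < 1e-9)  # True
```

With ordinary least squares fitted on the same sample, the two estimates coincide up to rounding error. This algebraic identity is specific to linear models and does not carry over in general to nonlinear models such as logistic regression, which is one reason the extensions mentioned above are hard to interpret.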

Recent advances from the causal inference literature (e.g. Albert, 2008; Albert and Nelson, 2011; Avin et al., 2005; Imai et al., 2010b; Pearl, 2001, 2012; Robins and Greenland, 1992; VanderWeele and Vansteelandt, 2009, 2010) have furthered these developments and improved both inference and interpretability of direct and indirect effect estimates in nonlinear settings by building on the central notion of counterfactual or potential outcomes. This notion provides a framework that has aided in (i) formally defining direct and indirect effects (in a way that is not tied to a specific statistical model), (ii) describing the conditions required for their identification (unveiling and formalizing often implicitly made causal assumptions) and (iii) assessing the robustness of empirical findings against violations of these identification conditions (i.e. sensitivity analysis).

For instance, Imai et al. (2010a) proposed mediation analysis techniques that can be applied within a larger class of nonlinear models. They implemented these in a user-friendly R package, called mediation (Tingley et al., 2014b; see Hicks and Tingley, 2011 for a version in Stata (StataCorp, 2013) with more limited functionality). More recently, Valeri and VanderWeele (2013) reviewed the latest developments in mediation analysis for nonlinear models, focusing on exposure-mediator interactions, and provided SAS (SAS Institute Inc., 2014) and SPSS (IBM Corporation, 2013) macros, enabling practitioners to easily apply these methods using well-known commercial packages. Similarly, Emsley and Liu (2013) and Muthén and Asparouhov (2015) described how direct and indirect effects as defined in the counterfactual framework can be estimated in Stata and via extended types of structural equation models in Mplus (Muthén and Muthén, 2012), respectively.

In this chapter, we introduce medflex (Steen et al., 2015), an R package that enables flexible estimation of direct and indirect effects while accommodating some of the limitations of other available packages. More specifically, we make use of novel so-called natural effect models (Lange et al., 2012, 2014; Loeys et al., 2013; Vansteelandt et al., 2012b), which directly parameterize the target causal estimands on their most natural scale. This renders formal testing and interpretation more straightforward compared to other approaches as implemented in the aforementioned software applications. The medflex package is freely available from the Comprehensive R Archive Network (CRAN) at http://cran.r-project.org/package=medflex (R Core Team, 2015).

Throughout, the functionalities of the medflex package will be illustrated using data from a survey study that was part of the Interdisciplinary Project for the Optimization of Separation trajectories (Ghent University and Catholic University of Louvain, 2010). This large-scale project involved the recruitment of individuals who divorced between March 2008 and March 2009 in four major courts in Flanders. It aimed to improve the quality of life in families during and after the divorce by translating research findings into practical guidelines for separation specialists (such as lawyers, judges, psychologists, welfare workers, ...) and by promoting evidence-based policy. The corresponding dataset UPBdata is included in the package and involves a subsample of 385 individuals who responded to a battery of questionnaires related to romantic relationship characteristics (such as adult attachment style) and breakup characteristics (such as breakup initiator status, experiencing negative affectivity and engaging in unwanted pursuit behaviors; UPB) (De Smet et al., 2012). Respondents were asked to imagine their former partner as well as possible and to remember how they generally felt in their relationship before the breakup when completing the attachment style questionnaire. The mediation hypothesis of interest concerned the question whether the level of emotional distress or negative affectivity experienced during the breakup can be regarded as an intermediate mechanism (M) through which attachment style towards the ex-partner before the breakup (A) exerts its influence on displaying UPBs after the breakup (Y) (Loeys et al., 2013).

In the next section, we briefly introduce the mediation formula (Pearl, 2001, 2012; Petersen et al., 2006; Imai et al., 2010b), which is the predominant vehicle for effect decomposition within the counterfactual framework. Advantages of natural effect models over direct application of the mediation formula will also be discussed in more detail. We then focus on two missing data techniques for fitting these models and demonstrate how these approaches can be implemented in R using the medflex package (section 4.3). Next, we demonstrate how different types of exposure and mediator variables can be dealt with (section 4.4) and how to assess effect modification of natural effects (i.e. exposure-mediator interactions and moderated mediation) (section 4.5). Tools are provided for calculating and visualizing different causal effect estimates (section 4.6) and for estimating population-average natural effects (section 4.7) and natural indirect effects as defined through multiple intermediate pathways jointly (section 4.8). In section 4.9, we further elaborate on modeling demands and missing data, two aspects that practitioners may need to take into consideration when choosing between the two main estimation approaches offered by the package. Finally, we conclude with some remarks and list some extensions of the package that are planned for the near future (section 4.10).