Latent factor regression with batch effects

Consider vectors $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^\top \in \mathbb{R}^p$, observed for $i = 1, \ldots, n$ individuals.

The factor regression model defines $x_i$ as a regression on $p_v$ observed covariates, denoted by $v_i \in \mathbb{R}^{p_v}$, and $q$ low-dimensional latent variables, denoted by $z_i \in \mathbb{R}^q$ and also known as latent coordinates or factors. The standard factor regression model is

$$x_i = \theta v_i + M z_i + e_i, \qquad (3.1)$$

where $\theta \in \mathbb{R}^{p \times p_v}$ is the matrix of regression coefficients, and $M$, $z_i$ and $e_i$ are as in the standard FA model (2.1).
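As a rough illustration of (3.1), the following sketch simulates data from the model. The dimensions, the standard normal factors and the diagonal form of $\Sigma$ are assumptions made here for the example; the text defers those details to the FA model (2.1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed, not taken from the text)
n, p, p_v, q = 200, 10, 2, 3

V = rng.normal(size=(n, p_v))        # observed covariates v_i, one row per individual
Z = rng.normal(size=(n, q))          # latent factors z_i (assumed standard normal)
theta = rng.normal(size=(p, p_v))    # regression coefficients
M = rng.normal(size=(p, q))          # factor loadings

# Assumption: diagonal error covariance Sigma, shared by all individuals
sigma2 = rng.uniform(0.5, 1.5, size=p)
E = rng.normal(size=(n, p)) * np.sqrt(sigma2)

# Row i of X is x_i = theta v_i + M z_i + e_i
X = V @ theta.T + Z @ M.T + E
```

Every individual shares the same error variances `sigma2` here, which is the restriction that the batch-effect extension relaxes.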

Equation (3.1) regresses the observed data $X$ on known covariates and on a latent factor structure. In particular, it accounts for additive batch effects by incorporating the variables recording the batches into $v_i$. In practice, however, one often observes more complex batch effects; in bioinformatics, for example, multiplicative effects on the variance are common (Johnson et al., 2007). We will later describe an example of this, shown in Figure 30. Such artefacts cannot be captured by (3.1), given that $\Sigma$ is assumed constant across all individuals.

To address this issue we extend (3.1) by allowing $\Sigma$ to depend on $i$. Suppose the data were obtained in $p_b$ batches, e.g. from different days, laboratories or instrumental calibrations, with $n_l$ individuals in batch $l$, for $l = 1, \ldots, p_b$, such that $n_1 + n_2 + \cdots + n_{p_b} = n$. Let $b_i$ be the indicator vector of length $p_b$ defined by $b_{il} := 1$ if individual $i$ is in batch $l$, and $b_{il} := 0$ otherwise.

We incorporate batch effects by adding a mean and a variance adjustment. We let

$$x_i = \theta v_i + M z_i + \beta b_i + e_i, \qquad (3.2)$$

where $\theta$, $v_i$, $M$ and $z_i$ are as in (3.1), $\beta \in \mathbb{R}^{p \times p_b}$ captures additive batch effects and the variance of $e_i$ captures multiplicative batch effects. We denote by $\tau_{jl}$, for $j = 1, \ldots, p$ and $l = 1, \ldots, p_b$, the $j$th idiosyncratic precision element in batch $l$. Then, given $b_{il} = 1$, the errors are independently distributed as $e_{ij} \sim N(0, \tau_{jl}^{-1})$. Further, denote by $T$ the $p \times p_b$ matrix that has $\tau_{jl}$ as its $(j, l)$ element.
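To make the batch-specific error structure concrete, the sketch below simulates from (3.2). All numerical choices (dimensions, random batch labels, and drawing the precisions from a gamma distribution) are assumptions for illustration only and are not prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative dimensions (assumed)
n, p, p_v, q, p_b = 300, 8, 2, 2, 3

batch = rng.integers(0, p_b, size=n)   # batch label of each individual (0-based)
B = np.eye(p_b)[batch]                 # row i is the indicator vector b_i

V = rng.normal(size=(n, p_v))          # observed covariates
Z = rng.normal(size=(n, q))            # latent factors
theta = rng.normal(size=(p, p_v))      # regression coefficients
M = rng.normal(size=(p, q))            # loadings
beta = rng.normal(size=(p, p_b))       # additive batch effects

# Hypothetical precisions tau_jl, stored as the p x p_b matrix T
T = rng.gamma(shape=2.0, scale=1.0, size=(p, p_b))

# B @ T.T selects, for each individual, the precisions of its own batch
sd = 1.0 / np.sqrt(B @ T.T)            # n x p error standard deviations
E = rng.normal(size=(n, p)) * sd       # e_ij ~ N(0, 1 / tau_jl) given b_il = 1

# Row i of X is x_i = theta v_i + M z_i + beta b_i + e_i
X = V @ theta.T + Z @ M.T + B @ beta.T + E
```

The product `B @ beta.T` shifts the mean of each batch, while `sd` rescales the noise per batch and variable, matching the additive and multiplicative adjustments described above.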

To help interpret the practical implications of the model, suppose that one has orthonormal factor loadings, $M^\top M = I$. Then (3.2) implies

$$z_i = M^\top \left( x_i - (\theta v_i + \beta b_i + e_i) \right) \qquad (3.3)$$

and thus $E(z_i \mid x_i, v_i, b_i, M, \theta, \beta) = M^\top x_i - M^\top \theta v_i - M^\top \beta b_i$. That is, the mean of the latent coordinates is the projection $M^\top x_i$ plus a translation given by the batch effect adjustment and (potentially) the observed covariates. An interesting observation is that their covariance, $\mathrm{Cov}(z_i \mid x_i, v_i, b_i, M, \theta, \beta, T) = M^\top T_{b_i}^{-1} M$, depends on the multiplicative batch-dependent noise, where $T_{b_i} = \mathrm{diag}(\tau_{1l}, \ldots, \tau_{pl})$ for the batch $l$ containing individual $i$. As an example, Figure 30(b) shows the first two factors of an ovarian dataset pre-processed by ComBat. Relative to the unadjusted data in Figure 30(a), ComBat removes systematic differences in mean and variance across the two batches; however, the latent coordinates exhibit distinct covariances. To obtain suitably adjusted low-dimensional coordinates one should estimate $T$ jointly with $(M, \theta, \beta)$.
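The dependence of the latent covariance on the batch can be checked numerically. The sketch below uses hypothetical precisions and reads $T_l$ as the diagonal matrix of the batch-$l$ precisions (an interpretation of the notation, not a definition given in the text); the helper `latent_cov` is introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 6, 2

# Orthonormal loadings: the Q factor of a reduced QR satisfies M.T @ M = I
M, _ = np.linalg.qr(rng.normal(size=(p, q)))

# Hypothetical precisions for two batches: the second batch is four times noisier
T = np.column_stack([np.full(p, 4.0), np.full(p, 1.0)])

def latent_cov(M, T, l):
    """Return M^T T_l^{-1} M, with T_l = diag(tau_1l, ..., tau_pl)."""
    return M.T @ np.diag(1.0 / T[:, l]) @ M

cov_batch1 = latent_cov(M, T, 0)   # equals 0.25 * I for these precisions
cov_batch2 = latent_cov(M, T, 1)   # equals I for these precisions
```

Because the precisions are constant within each batch in this toy setting, the covariance reduces to $\tau_l^{-1} I$, so the two batches differ only in scale; with precisions that vary across variables the shape would differ as well.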

Model (3.2) can be represented in matrix notation as

$$X = V\theta^\top + Z M^\top + B\beta^\top + E, \qquad (3.4)$$

where $E \in \mathbb{R}^{n \times p}$ is the matrix of errors.

As mentioned in Section 2.2, the latent factor model is identifiable only up to orthogonal transformations of the form $M^{*\top} = A^\top M^\top$ and $Z^{*} = ZA$, where $A$ is any orthogonal $q \times q$ matrix. Throughout this chapter we follow the same strategy as in the previous chapter, inducing sparse solutions via local and non-local penalties.
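The rotational invariance can be verified directly: rotating the factors by $A$ and the loadings by $A^\top$ leaves the product $ZM^\top$, and hence the fit in (3.4), unchanged. The following sketch uses arbitrary simulated matrices; the dimensions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 50, 7, 3

Z = rng.normal(size=(n, q))   # latent factors
M = rng.normal(size=(p, q))   # loadings

# A random orthogonal q x q matrix, taken from a QR decomposition
A, _ = np.linalg.qr(rng.normal(size=(q, q)))

Z_star = Z @ A                # Z* = Z A
M_star_T = A.T @ M.T          # M*^T = A^T M^T

# (Z A)(A^T M^T) = Z M^T because A A^T = I
same_fit = np.allclose(Z @ M.T, Z_star @ M_star_T)
```

Since the likelihood depends on $Z$ and $M$ only through this product, the data alone cannot distinguish $(Z, M)$ from $(Z^{*}, M^{*})$, which is why additional structure such as sparsity is needed.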

3.3 Prior formulation

To complete Model (3.2) we set priors for the loadings $M$, precisions $\tau_{jl}$, and regression parameters $(\theta, \beta)$. Throughout our proposed default prior formulation we assume that the columns of $X$ have been centred to zero mean and scaled to unit variance. For the idiosyncratic precisions $\tau_{jl}$ we set a prior with hyperparameters $\eta$ and $\xi$, independently across $j = 1, \ldots, p$ and $l = 1, \ldots, p_b$. By default in our examples we set the weakly informative values $\eta = \xi = 1$, leading to diffuse though proper priors.

Figure 21. Directed acyclic graph (DAG) for Bayesian factor regression with batch effect correction under different prior formulations: (a) flat prior (non-sparse loading matrix); (b) spike-and-slab prior (sparse loading matrix).

For the regression parameters we set

$$(\theta_j, \beta_j) \sim N(0, \psi I), \quad j = 1, \ldots, p, \qquad (3.6)$$

where $\psi$ is a user-defined prior dispersion, set by default to $\psi = 1$ in our examples. The choice $\psi = 1$ assigns the same marginal prior variances to the elements of $(\theta_j, \beta_j)$ as the unit information prior often adopted as a default for linear regression (Schwarz, 1978). We remark that this prior does not encourage sparsity in the regression parameters $(\theta, \beta)$, which we view as reasonable provided the numbers of variables $p_v$ and batches $p_b$ are moderate. For large $p_v$ or $p_b$, a direct extension of our prior on the loadings $M$ could be adopted.
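As a small illustration of (3.6), the sketch below draws the regression parameters from the prior with the default $\psi = 1$. The dimensions and the layout of each row as $(\theta_j, \beta_j)$ are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative dimensions (assumed)
p, p_v, p_b = 5, 2, 3
psi = 1.0   # prior dispersion; psi = 1 is the default stated in the text

# Row j holds one draw of (theta_j, beta_j) ~ N(0, psi * I)
coef = rng.normal(loc=0.0, scale=np.sqrt(psi), size=(p, p_v + p_b))

theta = coef[:, :p_v]   # p x p_v regression coefficients
beta = coef[:, p_v:]    # p x p_b additive batch effects
```

Every coefficient receives the same variance $\psi$ regardless of $p_v$ or $p_b$, which reflects the lack of shrinkage towards sparsity noted above.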

As shown in Chapter 2, sparsity in the loadings matrix $M$ plays a crucial role in easing interpretation. In this chapter we study the same five prior formulations:

• Flat: Equation (2.10);

• Normal-spike-and-slab (Normal-SS): Equation (2.21);

• Laplace-spike-and-slab (Laplace-SS): Equation (2.31);

• Normal-spike-and-MOM-slab (MOM-SS): Equation (2.36);

We finish the prior specification with a hierarchical prior over the latent indicators $\gamma = \{\gamma_{jk},\ j = 1, \ldots, p,\ k = 1, \ldots, q\}$, as in (2.20) (Chapter 2), for all the spike-and-slab priors.

Figure 21 provides the DAGs for Model (3.2) under the flat and spike-and-slab (SS) priors on the loadings.
