5.2 PROreg R-package
5.2.4 BBmm: Fit a beta-binomial mixed-effects regression model
The BBmm function fits beta-binomial mixed-effects models: it allows Gaussian random effects to be included in the linear predictor of a marginal beta-binomial logistic regression model in order to accommodate the correlation among the outcomes. It also allows the joint estimation of more than one outcome vector in a multivariate framework.
Each component of the model can be specified in two different ways:
Outcomes: (i) through the fixed.formula argument, or (ii) by supplying the vector of outcomes, y.
Fixed part: (i) through the fixed.formula argument, or (ii) by supplying the model matrix of the covariates, X.
Random part: (i) through the random.formula argument, or (ii) by supplying the model matrix of the random effects, Z, together with the number of random effects in each random component, nRandComp.
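As a rough illustration of option (ii), the objects y, X, Z and nRandComp could be assembled with base R as sketched below. The toy data frame d, its columns score, x and id, and the coding of Z as one indicator column per level are assumptions made for this example only; they are not prescribed by the package.

```r
# Toy data (hypothetical): 6 observations from 3 subjects
d <- data.frame(
  score = c(3, 7, 5, 9, 2, 6),                  # outcomes, scored from 0 to m
  x     = c(0.1, 1.2, -0.4, 2.0, -1.1, 0.8),    # a single covariate
  id    = factor(c("a", "a", "b", "b", "c", "c"))
)

y <- d$score                              # outcome vector
X <- model.matrix(~ x, data = d)          # fixed part: intercept and slope columns
Z <- model.matrix(~ 0 + id, data = d)     # random part: one indicator column per subject
nRandComp <- ncol(Z)                      # one random component with 3 random effects

# With these objects, a call following the Usage section would look like
# BBmm(X = X, y = y, Z = Z, nRandComp = nRandComp, m = 10)
```

With two random components, Z would hold the indicator columns of both components side by side and nRandComp would be a vector of length two giving the number of columns belonging to each.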
The fixed and random effects of the model can be estimated with two approaches: (i) BB-Delta, the delta algorithm developed for the beta-binomial mixed-effects model, and (ii) NR, a general Newton-Raphson algorithm. The chosen approach must be specified through the method argument of the function.
Usage
BBmm(fixed.formula=NULL, X=NULL, y=NULL, random.formula=NULL, Z=NULL,
     nRandComp=NULL, m, data=list(), method="BBNR", maxiter=50, show=FALSE,
     nDim=1)
Arguments
fixed.formula an object of class ‘formula’ (or one that can be coerced to that class): a symbolic description of the fixed part of the model to be fitted.
X design matrix of the covariates in the model. It must be specified only when the fixed.formula argument is not given.
y the vector of outcomes to be modelled as a function of the covariates. It must be specified only when the fixed.formula argument is not given.
random.formula an object of class ‘formula’ (or one that can be coerced to that class): a symbolic description of the random part of the model to be fitted.
Z design matrix of the correlation, or random effects, structure of the model. It must be specified only when the random.formula argument is not given.
nRandComp the number of random effects in each random component of the model. It must be specified as a vector whose ith value is the number of random effects of the ith random component. It must be included only when the random structure of the model is described through the matrix of the random effects, Z.
m maximum score number in each beta-binomial observation.
data an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula).
method the method for estimating the fixed and random effects of the model. Two options are available: (i) ‘BB-Delta’, the delta algorithm developed for the estimation procedure of the beta-binomial mixed-effects regression approach; (ii) ‘NR’, a general Newton-Raphson algorithm for finding the root of a set of n (nonlinear) equations.
maxiter the maximum number of iterations in the estimation process. Default 50.
show logical; if TRUE, each iteration prints the tolerance of the stopping criterion together with the maximum change, with respect to the previous iteration, in the fixed effects, the beta-binomial log-dispersion parameter and the standard deviations of the random effects.
nDim number of dimensions that are going to be jointly analysed. Default 1.
Details
The BBmm function fits a beta-binomial mixed-effects model. It extends the marginal beta-binomial logistic regression by including random effects in the linear predictor of the model. It is assumed that, conditional on some Gaussian random effects u, each component of the response vector Y follows a beta-binomial distribution with parameters $m_i$, $p_i$ and $\phi_i$,
\[ Y_i \mid u \sim \mathrm{BB}(m_i, p_i, \phi_i), \qquad u \sim N(0, D), \]
where
\[ \eta = \log\left(\frac{p}{1-p}\right) = X\beta + Zu, \]
X and Z being the model matrices composed of the given covariates and the random structure, respectively. The matrix $D = D(\lambda)$ is determined by some dispersion parameters $\lambda$, which are included in the parameter vector $\theta = (\phi, \lambda')'$.
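To make the hierarchy concrete, the following sketch draws one data set from a model of this form. It assumes a common dispersion phi, a single random component with D = sigma^2 I, and the parameterisation in which the success probability follows a Beta(p/phi, (1 - p)/phi) distribution; that parameterisation is an assumption of the example, since it is not stated in this section, and all numeric values are made up.

```r
set.seed(1)

n     <- 8                                   # number of observations
m     <- 10                                  # maximum score
nRand <- 4                                   # random effects in the single component
beta  <- c(-1, 0.5)                          # fixed effects: intercept and slope
sigma <- 1.2                                 # SD of the random effects, D = sigma^2 * I
phi   <- 0.3                                 # beta-binomial dispersion (assumed common)

x  <- rnorm(n)
X  <- cbind(1, x)                            # n x 2 fixed-effects design matrix
id <- rep(seq_len(nRand), each = n / nRand)  # group label of each observation
Z  <- outer(id, seq_len(nRand), "==") * 1    # n x nRand indicator design matrix
u  <- rnorm(nRand, 0, sigma)                 # u ~ N(0, D)

eta <- as.vector(X %*% beta + Z %*% u)       # linear predictor
p   <- plogis(eta)                           # inverse logit of eta

# ASSUMED parameterisation: probability ~ Beta(p/phi, (1-p)/phi), then binomial
prob <- rbeta(n, p / phi, (1 - p) / phi)
y    <- rbinom(n, m, prob)
```

Observations sharing a row pattern in Z share the same random effect, which is what induces the correlation among their outcomes.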
The fixed regression parameters β are estimated and the random effects u are predicted via maximum likelihood, where the marginal likelihood of the model is approximated through the joint likelihood by a first-order Laplace approximation,
\[ l(\beta, u, \theta \mid y) \approx \log f(y \mid \beta, u, \theta) + \log f(u \mid \theta). \tag{5.1} \]
The maximisation of the previous expression does not have a closed-form solution, so numerical methods are needed to develop an estimation procedure. Two approaches are available in the BBmm function to estimate the fixed and random effects: (i) a special case of the delta algorithm developed for the beta-binomial mixed-effects model estimation, and (ii) a general Newton-Raphson algorithm.
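The right-hand side of (5.1) is straightforward to evaluate for given values of β, u and θ. The sketch below does so under two assumptions that are not taken from this section: the beta-binomial is parameterised through a Beta(p/phi, (1 - p)/phi) mixing distribution, and D = sigma^2 I. The function names ldbetabin and joint_loglik are hypothetical helpers, not part of the package.

```r
# Beta-binomial log-density under the ASSUMED parameterisation
# Y | t ~ Bin(m, t), t ~ Beta(p / phi, (1 - p) / phi)
ldbetabin <- function(y, m, p, phi) {
  a <- p / phi
  b <- (1 - p) / phi
  lchoose(m, y) + lbeta(y + a, m - y + b) - lbeta(a, b)
}

# Joint log-likelihood log f(y | beta, u, theta) + log f(u | theta),
# assuming independent random effects with common SD sigma
joint_loglik <- function(beta, u, phi, sigma, y, m, X, Z) {
  eta <- as.vector(X %*% beta + Z %*% u)    # linear predictor
  p   <- plogis(eta)                        # inverse logit
  sum(ldbetabin(y, m, p, phi)) +            # log f(y | beta, u, theta)
    sum(dnorm(u, 0, sigma, log = TRUE))     # log f(u | theta)
}
```

Either estimation approach can then be seen as a numerical search for the values of β and u at which the derivatives of this function vanish, with θ held fixed.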
The estimation of the dispersion parameters θ from the joint likelihood may be substantially biased because of the previous estimation of the fixed and random effects. Consequently, the joint likelihood must be penalised in order to obtain an unbiased estimation of the dispersion parameters. Lee and Nelder (1996) proposed the adjusted profile h-likelihood for the correct estimation of the dispersion parameters in the mixed-effects model framework,
\[ h(\theta \mid \hat{\beta}, \hat{u}, y) = \log f(y \mid \beta, u, \theta) + \log f(u \mid \theta) + \frac{1}{2} \log\det\left(2\pi H^{-1}\right), \]
where H is the Hessian matrix of the model, i.e. the matrix of second derivatives of the log-likelihood with respect to β and u.
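The adjustment term can be computed without inverting H, because for a k x k matrix det(2πH^{-1}) = (2π)^k / det(H). The helper below is a minimal sketch of that computation; the name half_logdet_term is hypothetical, and the function assumes H is supplied with a positive determinant (for instance, the negative of the second-derivative matrix at the maximum), since otherwise the logarithm is not defined.

```r
# 0.5 * log det(2 * pi * H^{-1}) for a k x k matrix H with positive determinant,
# computed as 0.5 * (k * log(2 * pi) - log det H) to avoid inverting H
half_logdet_term <- function(H) {
  k  <- nrow(H)
  ld <- determinant(H, logarithm = TRUE)    # returns log|det H| and its sign
  stopifnot(ld$sign > 0)                    # the log is only defined for det H > 0
  0.5 * (k * log(2 * pi) - as.numeric(ld$modulus))
}

H <- matrix(c(4, 1, 1, 3), nrow = 2)        # made-up positive-definite example
penalty <- half_logdet_term(H)
```

Working on the log-determinant scale also avoids overflow when H is large, which matters because its dimension grows with the number of random effects.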
The BBmm methodology iterates between the estimation of the regression parameters and that of the dispersion parameters until convergence is reached. Convergence is reached when the tolerance of the model is lower than $10^{-6}$, where the tolerance for the (r + 1)th iteration is defined as
\[ \mathrm{tolerance}^{(r+1)} = \frac{\sum_{i=1}^{n} \left[ \eta_i^{(r)} - \eta_i^{(r+1)} \right]^2}{\sum_{i=1}^{n} \left[ \eta_i^{(r+1)} \right]^2}. \]
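The stopping rule translates directly into a couple of lines of R. In the sketch below the function name tolerance and the linear predictor values are made up for illustration; only the formula and the 10^-6 threshold come from the text above.

```r
# Relative squared change of the linear predictor between two iterations
tolerance <- function(eta_old, eta_new) {
  sum((eta_old - eta_new)^2) / sum(eta_new^2)
}

eta_r  <- c(0.50, -1.20, 2.00)    # linear predictor at iteration r (made-up values)
eta_r1 <- c(0.51, -1.19, 2.01)    # linear predictor at iteration r + 1 (made-up values)

tol       <- tolerance(eta_r, eta_r1)
converged <- tol < 1e-6           # stopping rule described in the text
```

Because the criterion is relative to the size of the current linear predictor, it does not depend on the scale of the covariates.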
Value
BBmm returns an object of class ‘BBmm’.
The function summary (i.e., summary.BBmm) can be used to obtain or print a summary of the results.
fixed.coef estimated value of the fixed effects of the regression.
fixed.vcov the variance-covariance matrix of the estimated fixed effects of the regression.
random.coef predicted random effects of the regression.
sigma.coef estimated value of the standard deviations of the random effects.
sigma.var variance of the estimation of the standard deviation of the random effects.
phi.coef estimated value of the dispersion parameter of the beta-binomial distribution.
psi.coef estimated value of the logarithm of the dispersion parameter of the beta-binomial distribution.
psi.var variance of the estimation of the logarithm of the dispersion parameter of the beta-binomial distribution.
fitted.values the fitted mean values of the probability parameter of the beta-binomial distribution.
conv convergence of the methodology. If the method has converged it returns ‘yes’, otherwise ‘no’.
deviance deviance of the model.
df degrees of freedom of the model.
null.deviance null-deviance, deviance of the null model. The null model will only include an intercept as the estimation of the probability parameter.
null.df degrees of freedom of the null model.
nRand number of random effects.
nComp number of random components.
nRandComp number of random effects in each random component of the model.
namesRand names of the random components.
iter number of iterations in the estimation method.
nObs number of observations in the data.
y the vector of the outcomes that are going to be modelled as a function of the covariates.
X design matrix composed by the given covariates in the model.
Z design matrix composed by the correlation, or random effects structure, of the model.
D variance-covariance matrix of the random effects.
balanced if the response beta-binomial variable is balanced it returns ‘yes’, otherwise ‘no’.
m maximum score number in each beta-binomial observation.
nDim number of dimensions that are going to be jointly analysed.
call the matched call.
formula the supplied fixed and random formulas. A formula is returned only if it was specified in the arguments of the function. The first formula corresponds to the fixed part of the model, while the second formula corresponds to the random structure.
Examples
> set.seed(15)
> # Defining the parameters
> nObs <- 500 # 500 observations
> m <- 10 # balanced data, maximum score number equal to 10
> nRandComp <- c(70,50) # number of random effects in each random component
> phi <- 1.1 # dispersion parameter of the beta-binomial distribution
> sigma1 <- 1.2 # standard deviation of the first random effect
> sigma2 <- 0.5 # standard deviation of the second random effect
> beta <- c(-1,3.25) # the fixed effects
>
> # Simulate
> x <- rnorm(nObs,0.5,1.5) # the covariate
> u1 <- rnorm(nRandComp[1],0,sigma1) # first random effects
> u2 <- rnorm(nRandComp[2],0,sigma2) # second random effects
> u <- as.vector(c(u1,u2))