Table 6.3 Link qualifiers and functions

Qualifier     Link                 Inverse link           Available with
!IDENTITY     η = µ                µ = η                  All
!SQRT         η = √µ               µ = η²                 Poisson
!LOGARITHM    η = ln(µ)            µ = exp(η)             Normal, Poisson, Negative Binomial, Gamma
!INVERSE      η = 1/µ              µ = 1/η                Normal, Gamma, Negative Binomial
!LOGIT        η = ln(µ/(1−µ))      µ = 1/(1+exp(−η))      Binomial, Multinomial Threshold
!PROBIT       η = Φ⁻¹(µ)           µ = Φ(η)               Binomial, Multinomial Threshold
!COMPLOGLOG   η = ln(−ln(1−µ))     µ = 1−exp(−exp(η))     Binomial, Multinomial Threshold

where µ is the mean on the data scale and η = Xτ is the linear predictor on the underlying scale.
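The link and inverse-link pairs of Table 6.3 can be checked numerically. The following is a minimal Python sketch, not ASReml code: the dictionary `LINKS` and the helper `round_trip` are hypothetical names introduced here for illustration. The logit is taken as ln(µ/(1−µ)), the form consistent with the inverse link µ = 1/(1+exp(−η)).

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, used by the probit link

# Hypothetical lookup: Table 6.3 qualifier name -> (link, inverse link).
LINKS = {
    "IDENTITY": (lambda mu: mu, lambda eta: eta),
    "SQRT": (math.sqrt, lambda eta: eta ** 2),
    "LOGARITHM": (math.log, math.exp),
    "INVERSE": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "LOGIT": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
    "PROBIT": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
    "COMPLOGLOG": (
        lambda mu: math.log(-math.log(1.0 - mu)),
        lambda eta: 1.0 - math.exp(-math.exp(eta)),
    ),
}


def round_trip(name, mu):
    """Map mu to the linear predictor scale and back again."""
    link, inverse = LINKS[name]
    return inverse(link(mu))


if __name__ == "__main__":
    # Each link should return (approximately) the value it started from.
    for name in LINKS:
        print(name, round_trip(name, 0.3))
```

A value such as µ = 0.3 lies in the valid domain of every link, so it serves as a common test point; the inverse and logarithm links would fail at µ = 0.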
ASReml includes facilities for fitting the family of Generalized Linear Models (GLMs, McCullagh and Nelder, 1994). A GLM is defined by a mean-variance function and a link function. In this context
y is the observation,
6 Command file: Specifying the terms in the mixed model 109
φ is a parameter set with the !PHI qualifier,
µ is the mean on the data scale, calculated using the inverse link function from the predicted value η on the underlying scale, where η = Xτ,
v is the variance under some distributional assumption, calculated as a function of µ and n, and
d is the deviance (minus twice the log-likelihood) for that distribution.
GLMs are specified by qualifiers after the name of the dependent variable but before the ∼ character. Table 6.3 lists the link function qualifiers which relate the linear predictor (η) scale to the observation (µ=E[y]) scale. Table 6.4 lists the distribution and other qualifiers.
Table 6.4: GLM distribution qualifiers
The default link is listed first followed by permitted alternatives.
qualifiers action
!NORMAL [!IDENTITY | !LOGARITHM | !INVERSE]
allows the model to be fitted on the log or inverse scale but with the residuals on the natural scale. !NORMAL !IDENTITY is the default.
!BINOMIAL [!LOGIT | !IDENTITY | !PROBIT | !COMPLOGLOG] [!TOTAL n]
v = µ(1−µ)/n
d = 2n(y ln(y/µ) + (1−y) ln((1−y)/(1−µ)))
Proportions or counts [r = ny] are indicated if !TOTAL specifies the variate containing the binomial totals. Proportions are assumed if no response value exceeds 1. A binary variate [0, 1] is indicated if !TOTAL is unspecified. The expression for d above applies when y is proportions (or binary). The logit is the default link function. The variance on the underlying scale is π²/3 ≈ 3.3 (underlying logistic distribution) for the logit link.
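The binomial variance and deviance expressions can be evaluated directly. The sketch below is illustrative Python rather than anything ASReml provides; the function names are hypothetical, and it assumes the usual convention that a term of the form 0·ln(0) is taken as zero so that binary responses (y = 0 or 1) give a finite deviance.

```python
import math


def binomial_variance(mu, n=1):
    """v = mu(1 - mu)/n for a proportion with binomial total n."""
    return mu * (1.0 - mu) / n


def binomial_deviance(y, mu, n=1):
    """d = 2n(y ln(y/mu) + (1 - y) ln((1 - y)/(1 - mu))) for a proportion y."""

    def term(a, b):
        # Assumed convention: a * ln(a / b) is zero when a is zero.
        return 0.0 if a == 0 else a * math.log(a / b)

    return 2.0 * n * (term(y, mu) + term(1.0 - y, 1.0 - mu))


if __name__ == "__main__":
    print(binomial_variance(0.5, n=10))
    print(binomial_deviance(1, 0.5))   # binary observation
    print(math.pi ** 2 / 3)            # logistic variance on the underlying scale
```

The deviance is zero when the observed proportion equals the fitted mean and grows as the two diverge.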
!MULTINOMIAL k !CUMULATIVE [!LOGIT | !PROBIT | !COMPLOGLOG] [!TOTAL n]   [ASReml3]
v_ij = µ_i(1−µ_j)/n for i ≤ j ≤ t
d = 2n Σ_{i=1}^{k} y_i ln(y_i/p_i)
where Y_i = Σ_{j=1}^{i} y_j, µ_i = E(Y_i) and p_i = µ_i − µ_{i−1}.
fits a multiple threshold model with t = k−1 thresholds to polytomous ordinal data with k categories assuming a multinomial distribution.
Typically, the response variable is a single variable containing the ordinal score (1:k) or a set of k variables containing counts (r_i) in the k categories. The response may also be a series of t binary variables or a series of t variables containing counts. If t counts are supplied, the total (including the kth category) must be given.
The multinomial threshold model is fitted as a cumulative probability model. The proportions (y_i = r_i/n) in the ordered categories are summed to form the cumulative proportions (Y_i), which are modelled with logit (!LOGIT), probit (!PROBIT) or complementary log-log (!CLOG) link functions. The implicit residual variance on the underlying scale is π²/3 ≈ 3.3 (underlying logistic distribution) for the logit link and 1 for the probit link. The distribution underlying the complementary log-log link is the Gumbel distribution, with implicit residual variance on the underlying scale of π²/6 ≈ 1.65. For example
Lodging !MULTINOMIAL 4 !CUMULATIVE ∼ Trait Variety !r block
predict Variety
where Lodging is a factor with 4 ordered categories. Predicted values are reported for the cumulative proportions.
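To make the cumulative construction concrete, the Python sketch below forms the t = k−1 cumulative proportions from category counts and recovers category probabilities from cumulative means. It is an illustration only; the function names are hypothetical, and the boundary values µ_0 = 0 and µ_k = 1 are assumptions made here to close the differences p_i = µ_i − µ_{i−1}.

```python
from itertools import accumulate


def cumulative_proportions(counts):
    """Return Y_1..Y_t (t = k - 1) from counts r_1..r_k in k ordered categories."""
    n = sum(counts)
    proportions = [r / n for r in counts]     # y_i = r_i / n
    cumulative = list(accumulate(proportions))
    return cumulative[:-1]                    # the k-th cumulative proportion is always 1


def category_probabilities(mu):
    """Return p_1..p_k from cumulative means mu_1..mu_t.

    Assumes mu_0 = 0 and mu_k = 1 at the boundaries.
    """
    bounds = [0.0] + list(mu) + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    counts = [10, 20, 30, 40]                 # hypothetical counts, k = 4 categories
    Y = cumulative_proportions(counts)
    print(Y)                                  # three cumulative proportions
    print(category_probabilities(Y))          # back to four category proportions
```

With k = 4 categories, as in the Lodging example, three cumulative proportions are modelled and the fourth category is implied.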
!POISSON [!LOGARITHM | !IDENTITY | !SQRT]
v = µ
d = 2(y ln(y/µ) − (y−µ))
Natural logarithms are the default link function. ASReml assumes the Poisson variable is not negative.
!GAMMA [!INVERSE | !IDENTITY | !LOGARITHM] [!PHI φ] [!TOTAL n]
v = µ²/(φn)
d = 2n(−φ ln(φy/µ) + (φy−µ)/µ)
The inverse is the default link function. n is defined with the !TOTAL qualifier and would be degrees of freedom in the typical application to mean-squares. The default value of φ is 1.
!NEGBIN [!LOGARITHM | !IDENTITY | !INVERSE] [!PHI φ]
v = µ + µ²/φ
d = 2((φ+y) ln((µ+φ)/(y+φ)) + y ln(y/µ))
fits the Negative Binomial distribution. Natural logarithms are the default link function. The default value of φ is 1.
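The Poisson and Negative Binomial expressions are closely related: as φ grows, the Negative Binomial variance µ + µ²/φ approaches the Poisson variance µ, and the deviances converge as well. The Python sketch below illustrates this numerically. The function names are hypothetical, and the convention that y ln(y/µ) is zero when y = 0 is an assumption made here.

```python
import math


def _y_log_ratio(y, mu):
    # Assumed convention: y * ln(y / mu) is zero when y is zero.
    return 0.0 if y == 0 else y * math.log(y / mu)


def poisson_deviance(y, mu):
    """d = 2(y ln(y/mu) - (y - mu))."""
    return 2.0 * (_y_log_ratio(y, mu) - (y - mu))


def negbin_variance(mu, phi=1.0):
    """v = mu + mu^2/phi."""
    return mu + mu ** 2 / phi


def negbin_deviance(y, mu, phi=1.0):
    """d = 2((phi + y) ln((mu + phi)/(y + phi)) + y ln(y/mu))."""
    return 2.0 * ((phi + y) * math.log((mu + phi) / (y + phi)) + _y_log_ratio(y, mu))


if __name__ == "__main__":
    y, mu = 3, 2.0
    print(poisson_deviance(y, mu))
    for phi in (1.0, 100.0, 1e6):
        # The Negative Binomial deviance moves toward the Poisson value as phi increases.
        print(phi, negbin_variance(mu, phi), negbin_deviance(y, mu, phi))
```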
General qualifiers
!AOD   [ASReml2]
requests an Analysis of Deviance table be generated. This is formed by fitting a series of sub models for terms in the DENSE part, building up to the full model, and comparing the deviances. An example of its use is
LS !BIN !TOT COUNT !AOD ∼ mu SEX GROUP
Caution: !AOD may not be used in association with PREDICT.
!DISP [h] includes an overdispersion scaling parameter (h) in the weights. If !DISP is specified with no argument, ASReml estimates it as the residual variance of the working variable. Traditionally it is estimated from the deviance residuals, reported by ASReml as Variance heterogeneity. The example given for !OFFSET also shows !DISP in use.
!OFFSET [o] is used especially with binomial data to include an offset in the model, where o is the number or name of a variable in the data. The offset is only included in binomial and Poisson models (for Normal models, just subtract the offset variable from the response variable), for example
count !POIS !OFFSET base !DISP ∼ mu group
The offset is included in the model as η = Xτ + o. The offset will often be something like ln(n).
!TOTAL [n] is used especially with binomial and ordinal data where n is the field containing the total counts for each sample. If omitted, count is taken as 1.
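The effect of an offset of the form ln(n) under the log link can be seen in a small calculation: because µ = exp(Xτ + ln(n)) = n·exp(Xτ), the linear predictor then models a rate per unit of n. The Python sketch below illustrates this; the function name and the argument `exposure` (standing in for n) are hypothetical.

```python
import math


def poisson_mean_with_offset(linear_predictor, exposure):
    """Mean count under a log link with offset o = ln(exposure)."""
    offset = math.log(exposure)        # o = ln(n)
    eta = linear_predictor + offset    # eta = X tau + o
    return math.exp(eta)               # inverse of the log link


if __name__ == "__main__":
    rate = 0.02                        # hypothetical rate per unit of exposure
    # The mean count scales in proportion to the exposure.
    print(poisson_mean_with_offset(math.log(rate), 500))
    print(poisson_mean_with_offset(math.log(rate), 1000))
```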
Residual qualifiers control the form of the residuals returned in the .yht file. The predicted values returned in the .yht file will be on the linear predictor scale if the !WORK or !PVW qualifiers are used. They will be on the observation scale if the !DEVIANCE, !PEARSON, !RESPONSE or !PVR qualifiers are used.
!DEVIANCE produces deviance residuals, the signed square root of d/h from Table 6.4, where h is the dispersion parameter controlled by the !DISP qualifier. This is the default.
!PEARSON writes Pearson residuals, (y−µ)/√v, in the .yht file.
!PVR writes fitted values on the response scale in the .yht file. This is the default.
!PVW writes fitted values on the linear predictor scale in the .yht file.
!RESPONSE produces simple residuals, y−µ.
!WORK produces residuals on the linear predictor scale, (y−µ)/(dµ/dη).
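The four residual definitions can be compared for a single observation. The Python sketch below does so for a Poisson response with the log link, where v = µ and dµ/dη = µ. It is an illustration under stated assumptions, not a description of ASReml's internals: the function name is hypothetical, and the sign of the deviance residual is taken from y − µ, which is the usual convention.

```python
import math


def poisson_residuals(y, mu, h=1.0):
    """Residuals for one Poisson observation under the log link."""
    v = mu                                        # Poisson variance function
    y_log = 0.0 if y == 0 else y * math.log(y / mu)
    d = max(2.0 * (y_log - (y - mu)), 0.0)        # deviance, guarded against rounding
    dmu_deta = mu                                 # log link: mu = exp(eta)
    return {
        # Assumed convention: the deviance residual takes the sign of y - mu.
        "DEVIANCE": math.copysign(math.sqrt(d / h), y - mu),
        "PEARSON": (y - mu) / math.sqrt(v),
        "RESPONSE": y - mu,
        "WORK": (y - mu) / dmu_deta,
    }


if __name__ == "__main__":
    for name, value in poisson_residuals(4, 2.0).items():
        print(name, value)
```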
[Revised 08] A second dependent variable may be specified (except with a multinomial response (!MULTINOMIAL)) if a bivariate analysis is required, but it will always be treated as a normal variate (no syntax is provided for specifying GLM attributes for it). The !ASUV qualifier is required in this situation for the GLM weights to be utilized.
Generalized Linear Mixed Models
This section was written by Damian Collins
A Generalized Linear Mixed Model (GLMM) is an extension of a GLM to include random terms in the linear predictor. Inference concerning GLMMs is impeded by the lack of a closed form expression for the likelihood. ASReml currently uses an approximate likelihood technique called penalized quasi-likelihood, or PQL (Breslow and Clayton, 1993), which is based on a first order Taylor series approximation. This technique is also known as Schall's technique (Schall, 1991), pseudo-likelihood (Wolfinger and O'Connell, 1993) and joint maximisation (Harville and Mee, 1984, Gilmour et al., 1985). Implementations of PQL are found in many statistical packages, for instance, in the GLMM (Welham, 2005) and the IRREML procedures of Genstat (Keen, 1994), the MLwiN package (Goldstein et al., 1998), the GLIMMIX macro in SAS (Wolfinger, 1994), and in the glmmPQL function in R.
The PQL technique is well-known to suffer from estimation biases for some types of GLMMs. For grouped binary data with small group sizes, estimation biases can be over 50% (e.g. Breslow and Lin, 1995, Goldstein and Rasbash, 1996, Rodriguez and Goldman, 2001, Waddington et al., 1994). For other GLMMs, PQL has been reported to perform adequately (e.g. Breslow, 2003). McCulloch and Searle (2001) also discuss the use of PQL for GLMMs.
The performance of PQL in other respects, such as for hypothesis testing, has received much less attention, and most studies into PQL have examined only relatively simple GLMMs. Anecdotal evidence suggests that this technique may give misleading results in certain situations. We therefore cannot recommend this technique for general use; it is included in the current version of ASReml for advanced users. If this technique is used, we recommend cross-validatory assessment, such as applying PQL to simulated data from the same design (Millar and Willis, 1999).
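One way to set up such an assessment is to generate data sets from known parameter values on the design of interest and compare the PQL estimates against them. The Python sketch below covers only the data-generation step for grouped binary data with a logit link and a single Normal random group effect. The function name, its arguments and the design itself are hypothetical; fitting each simulated data set in ASReml is not shown.

```python
import math
import random


def simulate_binary_glmm(n_groups, group_size, intercept, sigma2, seed=None):
    """Simulate (group, y) records from a logit GLMM with random group effects.

    Each group g receives an effect u_g ~ N(0, sigma2); each binary response
    has probability 1/(1 + exp(-(intercept + u_g))).
    """
    rng = random.Random(seed)
    records = []
    for group in range(n_groups):
        u = rng.gauss(0.0, math.sqrt(sigma2))               # random group effect
        p = 1.0 / (1.0 + math.exp(-(intercept + u)))        # inverse logit
        for _ in range(group_size):
            records.append((group, 1 if rng.random() < p else 0))
    return records


if __name__ == "__main__":
    # Small groups are the case in which PQL bias has been reported to be largest.
    data = simulate_binary_glmm(n_groups=50, group_size=4,
                                intercept=0.0, sigma2=1.0, seed=2024)
    print(len(data), sum(y for _, y in data))
```

Repeating the simulation over many seeds and comparing the average estimated variance component with the value of `sigma2` used to generate the data gives a direct indication of bias for that design.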
Caution: The standard GLM Analysis of Deviance (!AOD) should not be used when there are random terms in the model, as the variance components are re-estimated for each submodel.