Dose-Response Modeling - Experimentos

plot, we have used a box-plot for the residuals rather than plot them individually; this will usually be more understandable when there are relatively many points to be put in a single column.

What we see from the side-by-side plot is that the treatment effects are large compared to the size of the residuals. We were also able to see this in the parallel box-plots in the exploratory analysis, but the side-by-side plots will generalize better to more complicated models.

3.10 Dose-Response Modeling

In some experiments, the treatments are associated with numerical levels such as drug dose, baking time, or reaction temperature. We will refer to such levels as doses, no matter what they actually are, and the numerical value of the dose for treatment $i$ will be denoted $z_i$. When we have numerical doses, we may reexpress the treatment means as a function of the dose $z_i$:

$$\mu + \alpha_i = f(z_i; \theta),$$

where $\theta$ is some unknown parameter of the function. For example, we could express the mean weight of yellow birch seedlings as a function of the pH of acid rain.

The most commonly used functions $f$ are polynomials in the dose $z_i$:

$$\mu + \alpha_i = \theta_0 + \theta_1 z_i + \theta_2 z_i^2 + \cdots + \theta_{g-1} z_i^{g-1}.$$

We use the power $g - 1$ because the means at $g$ different doses determine a polynomial of order $g - 1$. Polynomials are used so often because they are simple and easy to understand; they are not always the most appropriate choice.
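As a minimal sketch of why the means at $g$ doses determine a polynomial of order $g - 1$, the following pure-Python code evaluates the unique interpolating polynomial in Lagrange form. The doses and treatment means are hypothetical values chosen only for illustration, and `interpolate` is a helper name introduced here, not something from the text.

```python
def interpolate(doses, means, z):
    """Evaluate at z the degree-(g-1) polynomial through the g points (doses[i], means[i])."""
    total = 0.0
    for i, (z_i, mean_i) in enumerate(zip(doses, means)):
        # Lagrange basis polynomial: equals 1 at z_i and 0 at every other dose.
        weight = 1.0
        for j, z_j in enumerate(doses):
            if j != i:
                weight *= (z - z_j) / (z_i - z_j)
        total += mean_i * weight
    return total

# Hypothetical doses and treatment means (g = 4), for illustration only.
doses = [1.0, 2.0, 3.0, 4.0]
means = [2.0, 3.1, 3.7, 4.0]

# The cubic reproduces every treatment mean at its own dose.
fitted = [interpolate(doses, means, z) for z in doses]

# It also produces values between and beyond the doses that were used.
between = interpolate(doses, means, 2.5)
beyond = interpolate(doses, means, 8.0)
```

Here `fitted` matches `means`, which is the sense in which the full polynomial and the separate treatment means carry the same information. The value in `beyond` is an extrapolation and should be treated with corresponding caution.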

If we know the polynomial coefficients $\theta_0, \theta_1, \ldots, \theta_{g-1}$, then we can determine the treatment means $\mu + \alpha_i$, and vice versa. If we know the polynomial coefficients except for the constant $\theta_0$, then we can determine the treatment effects $\alpha_i$, and vice versa. The $g - 1$ parameters $\theta_1$ through $\theta_{g-1}$ in this full polynomial model correspond to the $g - 1$ degrees of freedom between the treatment groups. Thus polynomials in dose are not inherently better or worse than the treatment effects model, just another way to describe the differences between means.

56 Completely Randomized Designs

Polynomial modeling is useful in two contexts. First, if only a few of the polynomial coefficients are needed (that is, the others can be set to zero without significantly decreasing the quality of the fit), then this reduced polynomial model represents a reduction in the complexity of our model. For example, learning that the response is linear or quadratic in dose is useful, whereas a polynomial of degree six or seven will be difficult to comprehend (or sell to anyone else). Second, if we wish to estimate the response at some dose other than one used in the experiment, the polynomial model provides a mechanism for generating the estimates. Note that these estimates may be poor if we are extrapolating beyond the range of the doses in our experiment or if the degree of the polynomial is high. High-order polynomials will fit our observed treatment means exactly, but these high-order polynomials can have bizarre behavior away from our data points.

Consider a sequence of regression models for our data, regressing the responses on dose, dose squared, and so on. The first model just includes the constant $\theta_0$; that is, it fits a single value for all responses. The second model includes the constant $\theta_0$ and a linear term $\theta_1 z_i$; this model fits the responses as a simple linear regression in dose. The third model includes the constant $\theta_0$, a linear term $\theta_1 z_i$, and the quadratic term $\theta_2 z_i^2$; this model fits the responses as a quadratic function (parabola) of dose. Additional models include additional powers of dose up to $g - 1$.

Let $SSR_k$ be the residual sum of squares for the model that includes powers up to $k$, for $k = 0, \ldots, g - 1$. Each successive model will explain a little more of the variability between treatments, so that $SSR_k > SSR_{k+1}$. When we arrive at the full polynomial model, we will have explained all of the between-treatment variability using polynomial terms; that is, $SSR_{g-1} = SSE$. The “linear sum of squares” is the reduction in residual variability going from the constant model to the model with the linear term:

$$SS_{\text{linear}} = SS_1 = SSR_0 - SSR_1.$$

Similarly, the “quadratic sum of squares” is the reduction in residual variability going from the linear model to the quadratic model,

$$SS_{\text{quadratic}} = SS_2 = SSR_1 - SSR_2,$$

and so on through the remaining orders.
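The sequence of fits can be sketched in a few lines of pure Python. This is one possible way to compute the residual sums of squares, using Gram-Schmidt orthogonalization of the powers of dose rather than any particular statistics package; the data, the helper names `dot` and `sequential_ssr`, and the choice of $g = 4$ doses with three responses each are all assumptions made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sequential_ssr(z, y, max_power):
    """Return [SSR_0, ..., SSR_max_power] for polynomial models in z of increasing order."""
    basis = []         # mutually orthogonal columns spanning 1, z, ..., z^k
    resid = list(y)    # residuals from the current model
    ssr = []
    for k in range(max_power + 1):
        col = [z_i ** k for z_i in z]
        # Gram-Schmidt step: remove the part of z^k already explained by lower powers.
        for q in basis:
            c = dot(col, q) / dot(q, q)
            col = [a - c * b for a, b in zip(col, q)]
        basis.append(col)
        # Adding the orthogonalized column updates the least-squares residuals.
        c = dot(resid, col) / dot(col, col)
        resid = [r - c * b for r, b in zip(resid, col)]
        ssr.append(dot(resid, resid))
    return ssr

# Hypothetical data: g = 4 doses, 3 responses per dose.
z = [1.0] * 3 + [2.0] * 3 + [3.0] * 3 + [4.0] * 3
y = [2.1, 1.9, 2.0, 3.2, 2.9, 3.1, 3.6, 3.9, 3.7, 4.1, 3.8, 4.0]
g = 4

ssr = sequential_ssr(z, y, g - 1)
# ss[k] = SSR_{k-1} - SSR_k: linear (k = 1), quadratic (k = 2), cubic (k = 3).
ss = {k: ssr[k - 1] - ssr[k] for k in range(1, g)}

# SSE from the separate-means model: squared deviations from each group's mean.
sse = 0.0
for dose in sorted(set(z)):
    group = [y_i for z_i, y_i in zip(z, y) if z_i == dose]
    mean = sum(group) / len(group)
    sse += sum((y_i - mean) ** 2 for y_i in group)
```

With these inputs, `ssr[g - 1]` agrees with `sse` up to rounding, and the entries of `ss` add up to the between-treatment variability `ssr[0] - sse`.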

Each of these polynomial sums of squares has 1 degree of freedom, because each is the result of adding one more parameter $\theta_k$ to the model for the means. Thus their mean squares are equal to their sums of squares. In a model with terms up through order $k$, we can test the null hypothesis that $\theta_k = 0$ by forming the F-statistic $SS_k / MSE$, and comparing it to an F-distribution with 1 and $N - g$ degrees of freedom.
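A small standard-library sketch of this test follows. It relies on the fact that an F variable with 1 numerator degree of freedom is the square of a Student's t variable, and obtains the p-value by numerically integrating the t density with Simpson's rule. The function names `t_density` and `f_test_1df` are introduced here for illustration; the inputs are the quadratic line and error line of the fourth-power model in Listing 3.3.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def f_test_1df(ss_k, mse, error_df, steps=2000):
    """Return (F, p-value) for a 1-degree-of-freedom F-test of theta_k = 0."""
    f_stat = ss_k / mse
    # With 1 numerator df, F is the square of a t variable, so
    # P(F > f) = 1 - 2 * (integral of the t density from 0 to sqrt(f)).
    upper = math.sqrt(f_stat)
    h = upper / steps
    # Composite Simpson's rule; steps must be even.
    total = t_density(0.0, error_df) + t_density(upper, error_df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, error_df)
    p_value = 1.0 - 2.0 * total * h / 3.0
    return f_stat, p_value

# Quadratic term in Listing 3.3: SS = 0.078343, MSE = 0.0091779,
# with N - g = 32 error degrees of freedom.
f_stat, p_value = f_test_1df(0.078343, 0.0091779, 32)
```

In practice a statistics package would supply the F-distribution tail probability directly; the integration here is only meant to keep the sketch self-contained.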

One method for choosing a polynomial model is to choose the smallest order such that no significant terms are excluded. (More sophisticated

Listing 3.3: MacAnova output for resin lifetimes polynomial model.

                      DF   SS           MS           F            P-value    ①
CONSTANT              1    79.425       79.425       8653.95365   0
{temperature}         1    3.4593       3.4593       376.91283    0
{(temperature)^2}     1    0.078343     0.078343     8.53610      0.0063378
{(temperature)^3}     1    1.8572e-05   1.8572e-05   0.00202      0.9644
{(temperature)^4}     1    8.2568e-06   8.2568e-06   0.00090      0.97626
ERROR1                32   0.29369      0.0091779

CONSTANT                                                                     ②
(1)      0.96995
{temperature}
(1)      0.075733
{(temperature)^2}
(1)      -0.00076488
{(temperature)^3}
(1)      2.6003e-06
{(temperature)^4}
(1)      -2.9879e-09

                      DF   SS           MS           F            P-value    ③
CONSTANT              1    79.425       79.425       9193.98587   0
{temperature}         1    3.4593       3.4593       400.43330    0
{(temperature)^2}     1    0.078343     0.078343     9.06878      0.0048787
ERROR1                34   0.29372      0.0086388

CONSTANT                                                                     ④
(1)      7.418
{temperature}
(1)      -0.045098
{(temperature)^2}
(1)      7.8604e-05

coefficients $\hat{\theta}_i$ depend on which terms are in the model when the model is estimated. Thus if we decide we only need $\theta_0$, $\theta_1$, and $\theta_2$ when $g$ is 4 or more, we should refit using just those terms to get appropriate parameter estimates.

Example 3.7: Resin lifetimes, continued

The treatments in the resin lifetime data are different temperatures (175, 194, 213, 231, and 250 degrees C), so we can use these temperatures as doses $z_i$ in a dose-response relationship. With $g = 5$ treatments, we can use polynomials up to power 4.

Listing 3.3 shows output for polynomial dose-response modeling of the resin lifetime data. The first model fits terms up to temperature to the fourth power. From the ANOVA ① we can see that neither the third nor fourth powers are significant, but the second power is, so a quadratic model seems appropriate. The ANOVA for the reduced model is at ③. The linear and quadratic sums of squares are the same as in ①, but the SSE in ③ is increased by the cubic and quartic sums of squares in ①. We can also see that the intercept, linear, and quadratic coefficients change dramatically from the full model ② to the reduced model using just those terms ④. We cannot simply take the intercept, linear, and quadratic coefficients from the fourth power model and use them as if they were coefficients in a quadratic model.
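The relationships described in this example can be checked with a little arithmetic on the numbers printed in Listing 3.3. The sketch below is only a check on the printed, rounded values; the helper `poly` and the choice of 213 degrees as the evaluation point are illustrative assumptions.

```python
# Values copied from Listing 3.3.
sse_full, df_full = 0.29369, 32                  # ERROR1 line of ANOVA ①
ss_cubic, ss_quartic = 1.8572e-05, 8.2568e-06    # cubic and quartic lines of ①
ss_quadratic = 0.078343

# Dropping the cubic and quartic terms moves their SS and df into error.
sse_reduced = sse_full + ss_cubic + ss_quartic
df_reduced = df_full + 2
mse_reduced = sse_reduced / df_reduced
f_quadratic = ss_quadratic / mse_reduced

def poly(coefs, z):
    """Evaluate coefs[0] + coefs[1]*z + coefs[2]*z**2 + ... by Horner's rule."""
    value = 0.0
    for c in reversed(coefs):
        value = value * z + c
    return value

full = [0.96995, 0.075733, -0.00076488, 2.6003e-06, -2.9879e-09]   # coefficients ②
reduced = [7.418, -0.045098, 7.8604e-05]                           # coefficients ④

z = 213.0                            # one of the experimental temperatures
fit_full = poly(full, z)             # fourth-power model
fit_reduced = poly(reduced, z)       # refitted quadratic model
fit_truncated = poly(full[:3], z)    # full-model coefficients misused as a quadratic
```

Up to rounding in the printed output, `sse_reduced`, `mse_reduced`, and `f_quadratic` reproduce the ERROR1 and quadratic lines of ③. The two properly fitted models give nearly the same value at 213 degrees (roughly 1.38), while the truncated polynomial gives a value near -17.6, which shows why the quadratic must be refit.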

One additional trick to remember when building a dose-response model is that we can transform or reexpress the dose $z_i$. That is, we can build models using log of dose or square root of dose as simply as we can using dose. For some data it is much simpler to build a model as a function of a transformation of the dose.
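To illustrate, the sketch below fits a straight line to the same set of means twice, once against dose and once against log dose. The means are hypothetical and constructed to be exactly linear in log dose, so this is a best-case illustration rather than a claim about any real data; `line_ssr` is a helper name introduced here.

```python
import math

def line_ssr(x, y):
    """Residual sum of squares from the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))

# Hypothetical, noise-free means that are exactly linear in log dose.
doses = [1.0, 2.0, 4.0, 8.0, 16.0]
means = [1.0 + 2.0 * math.log(z) for z in doses]

ssr_dose = line_ssr(doses, means)                              # straight line in dose
ssr_log_dose = line_ssr([math.log(z) for z in doses], means)   # straight line in log dose
```

The line in log dose leaves essentially no residual variation, whereas the line in dose leaves a substantial amount and would need higher-order terms to catch up.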
