Multidimensional Mastery Testing - Simulation Considerations

4.1.3 Multidimensional Mastery Testing

Multidimensional computerized mastery testing (MCMT) requires algorithms to determine when an examinee’s latent trait is located within a pre-specified region of multidimensional space. These regional definitions can also be used to determine the optimal item selection rules for differentiating between examinees who lie just inside each region. For example, Chapter 3 shows that items should be selected primarily based on the cut-point separating categories. Because the multidimensional compensatory IRT model, as defined in Equation (4.1), is similar in form to the 3PL model, one should also pick multidimensional mastery items based on the boundary between mastery and non-mastery.

Very little work has extended mastery testing to multidimensional problems. The first paper to discuss multidimensional mastery testing, Spray et al. (1997), quantified mastery based on a minimally competent percentage of correct responses, p0. If p0 = Σ_j p_j(θ0), then θ0 divides the latent space into two regions: a mastery region, in which the percentage of correct responses is typically greater than p0, and a non-mastery region, in which the percentage of correct responses is typically less than p0. As Spray et al. (1997) noted, the values of θ0 that satisfy p0 = Σ_j p_j(θ0) define a curve in ℝ^K, where K is the dimension of θ. To illustrate the passing region described by Spray et al. (1997), I generated parameters for J = 40 items following a two-dimensional C-MIRT model with ā1 = .81 (s_a1 = .59), ā2 = .84 (s_a2 = .61), d̄ = −.53 (s_d = .82), and c = 0. I then determined the (θ1, θ2) pairs that would result in average, model-predicted probabilities of p0 = .4, p0 = .6, and p0 = .8. The resulting classification bound functions are plotted in Figure 4.1. Note that for p0 = .4 and p0 = .8, the threshold functions define non-linear curves in two-dimensional space.
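The constant-probability bounds described above can be traced numerically. The following is a minimal sketch, not the procedure actually used to produce Figure 4.1. It assumes a logistic C-MIRT form with intercept d (Equation (4.1) is not reproduced in this section, so the exact form, including any scaling constant, is an assumption), it averages the item probabilities over the J items in line with the "average, model-predicted probabilities" mentioned above, and it draws hypothetical item parameters (uniform discriminations, normal intercepts) because the generating distributions are not stated. All function names are illustrative.

```python
import math
import random


def cmirt_prob(theta, a, d, c=0.0):
    """Response probability under an assumed logistic compensatory MIRT form."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return c + (1.0 - c) / (1.0 + math.exp(-z))


def mean_prob(theta, items):
    """Average model-predicted probability of a correct response over all items."""
    return sum(cmirt_prob(theta, a, d, c) for a, d, c in items) / len(items)


def bound_theta2(theta1, p0, items, lo=-10.0, hi=10.0, tol=1e-8):
    """Bisect for the theta2 at which mean_prob((theta1, theta2)) equals p0.

    Relies on every item having a positive second discrimination, so that
    the average probability increases monotonically in theta2.
    Returns None when the target p0 is not bracketed by [lo, hi].
    """
    if mean_prob((theta1, lo), items) > p0 or mean_prob((theta1, hi), items) < p0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_prob((theta1, mid), items) < p0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical bank of J = 40 items; these distributions are assumptions,
# chosen only so that all discriminations are positive.
rng = random.Random(1)
items = [
    ((rng.uniform(0.2, 1.5), rng.uniform(0.2, 1.5)), rng.gauss(-0.53, 0.82), 0.0)
    for _ in range(40)
]

# One list of (theta1, theta2) boundary points per passing proportion.
curves = {
    p0: [(t1, bound_theta2(t1, p0, items)) for t1 in (-2.0, -1.0, 0.0, 1.0, 2.0)]
    for p0 in (0.4, 0.6, 0.8)
}
```

Each entry of `curves` holds points on one bound. Plotting θ2 against θ1 for each p0 should give curves of the kind shown in Figure 4.1, although the exact shapes depend on the assumed item parameters.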

Figure 4.1: Classification bound functions assuming a minimal, constant, model-predicted probability for passing the test. Probabilities were generated using the two-dimensional C-MIRT model with ā1 = .81, ā2 = .84, d̄ = −.53, and c = 0. (Axes: θ1 and θ2; curves shown for p0 = .4, p0 = .6, and p0 = .8.)

As originally proposed by Spray et al. (1997), constructing these constant probability classification bounds requires an unchanging set of parameters and a fixed model. Different models will yield different mastery regions. One could, of course, define the mastery region based on a test set of items and then interpolate a curve between those points to use with alternative item banks or IRT models. However, Glas and Vos (2010) argued that the passing region should not necessarily be directly related to the underlying model. According to Glas and Vos (2010), “the choice of compensatory or non-compensatory model is an empirical matter, ... [whereas] the choice of ... [classification region] is a value judgment determined by the opinion of who can be qualified as a master” (p. 429). In other words, responses to mathematical comprehension items might (empirically) be determined by a linear combination of reading and computational abilities, but examinees might still need sufficient ability on both dimensions to qualify as a master. Disconnecting the mastery decision from the item response function, Glas and Vos (2010) defined two types of classification procedures. A non-compensatory (or conjunctive) classification procedure requires examinees to be above a threshold on all dimensions to qualify as a master. An example of a two-dimensional, non-compensatory classification task is provided in Figure 4.2. One could modify non-compensatory classification regions for use in diagnostic classification modeling by requiring the posterior probability of an examinee on each of the required attributes to exceed some threshold. Conversely, a compensatory classification procedure requires a linear combination of an examinee’s traits to be above a threshold for the examinee to qualify as a master. An example of a two-dimensional, compensatory classification task is provided in Figure 4.3.
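The two classification procedures can be written as simple decision rules. The sketch below is illustrative only: the function names, the weights, and the threshold values are hypothetical and are not taken from Glas and Vos (2010).

```python
def is_master_conjunctive(theta, cuts):
    """Non-compensatory rule: above the cut-point on every dimension."""
    return all(t > c for t, c in zip(theta, cuts))


def is_master_compensatory(theta, weights, threshold):
    """Compensatory rule: a linear combination of traits exceeds a threshold."""
    return sum(w * t for w, t in zip(weights, theta)) > threshold


# Hypothetical examinee: strong on dimension 1, slightly weak on dimension 2.
examinee = (2.0, -0.5)

conjunctive_result = is_master_conjunctive(examinee, cuts=(0.0, 0.0))
compensatory_result = is_master_compensatory(
    examinee, weights=(1.0, 1.0), threshold=0.0
)
```

With these illustrative values, the examinee fails the conjunctive rule because θ2 is below its cut-point, but passes the compensatory rule because the high θ1 offsets the low θ2.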

Glas and Vos (2010) proposed compensatory and non-compensatory classification regions for constructing loss functions in multidimensional space. Once loss functions were defined, they used Bayesian decision theory both to select items and to make classification decisions. They found that multidimensional CMT improved over a unidimensional analogue as the correlation between the dimensions decreased.

Figure 4.2: A diagram of a non-compensatory classification task (axes: θ1 and θ2). An examinee is required to be in the shaded, green box (upper-right) to be considered a master and, therefore, must be above the threshold on both dimensions.

Figure 4.3: A diagram of a compensatory classification task (axes: θ1 and θ2). An examinee is required to be in the shaded, green box (upper-right) to be considered a master. However, for this task, a sufficiently high ability on one dimension would compensate for a low ability on the other dimension.

The most recent conception of multidimensional CMT was described by Seitz and Frey (2013). As in Spray et al. (2011), Seitz and Frey (2013) were unable to generalize the SPRT stopping rule without severe restrictions on the item bank and the classification function. For instance, Seitz and Frey (2011) chose an item bank with between-item multidimensionality. Because they assumed that each item only loaded on one dimension, they simplified the classification task by comparing every θ0k + δ against θ0k − δ, where θ0k is the cut-point for dimension k. Therefore, the SPRT described by Seitz and Frey (2013) contrasts the point hypotheses H0: θi = θ0 − δ and H1: θi = θ0 + δ. Because the point hypotheses are the same for all examinees and all items, these authors avoid constructing a mastery region or considering the distance between each examinee’s trait level and the border of that region. Moreover, as shown in Figures 4.2 and 4.3, testing θ0 + δ against θ0 − δ would be consistent with non-compensatory, compensatory, or a variety of other classification bound functions. Additionally, Seitz and Frey (2013) avoided developing or using item selection algorithms appropriate for mastery tests and, instead, used an algorithm equivalent to maximizing Fisher information at the current ability estimate. Thus, in the remaining sections of this chapter, I propose modified SPRT-based stopping rules and item selection algorithms appropriate for determining whether examinees are within regions of multidimensional space.
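For reference, the point-hypothesis SPRT attributed to Seitz and Frey (2013) above can be sketched as follows. This is a minimal illustration under stated assumptions, not their implementation: each item is assumed to load on a single dimension through a two-parameter logistic function with hypothetical slope a and intercept d, the decision bounds are Wald's standard approximations, and the values of δ, α, and β are placeholders.

```python
import math


def item_prob(theta_k, a, d):
    """Assumed logistic response function for an item loading on one dimension."""
    return 1.0 / (1.0 + math.exp(-(a * theta_k + d)))


def sprt_decision(responses, items, cuts, delta=0.25, alpha=0.05, beta=0.05):
    """Test H0: theta = theta0 - delta against H1: theta = theta0 + delta.

    responses: 0/1 scores; items: (dimension index, a, d) per administered item;
    cuts: the cut-point theta0k for each dimension k.
    Returns the decision and the accumulated log-likelihood ratio.
    """
    log_lr = 0.0
    for u, (k, a, d) in zip(responses, items):
        p_hi = item_prob(cuts[k] + delta, a, d)
        p_lo = item_prob(cuts[k] - delta, a, d)
        if u == 1:
            log_lr += math.log(p_hi / p_lo)
        else:
            log_lr += math.log((1.0 - p_hi) / (1.0 - p_lo))
    upper = math.log((1.0 - beta) / alpha)  # Wald bound for accepting H1
    lower = math.log(beta / (1.0 - alpha))  # Wald bound for accepting H0
    if log_lr >= upper:
        return "master", log_lr
    if log_lr <= lower:
        return "non-master", log_lr
    return "continue", log_lr


# Hypothetical bank: 20 items alternating between the two dimensions.
bank = [(j % 2, 1.0, 0.0) for j in range(20)]
all_correct = sprt_decision([1] * 20, bank, cuts=(0.0, 0.0))
all_wrong = sprt_decision([0] * 20, bank, cuts=(0.0, 0.0))
```

Because the same two points are compared for every examinee, the statistic depends only on the responses and the item parameters at θ0k ± δ. It never refers to the shape of the mastery region, which is the limitation noted above.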
