
Chapter 4 Learning Promotion and Mastery Detection

4.3 Multiple Attribute Case

4.3.1 Transition Probability Matrix

4.3.2.1 Item Selection Method: MOES

The approach of the OPP-based methods is extended to assessments that aim to achieve complete mastery of multiple attributes. Like the OPP-based methods, the multivariate extension adopts a Bayesian approach: it calculates the expected value of the sum of attributes given the past progress and a correct response to a candidate item. By evaluating the expected sum of attributes after administering the candidate item, we seek to maximize the gain in overall skills. The desired outcome given the most evidential observation, a correct response, is growth in the skills vector, which can entail mastery of any subset of the attributes. Previously, because the MOPP method involved the mastery of only a single attribute, the evaluation of a candidate item did not require the Q-vector; in that case the one-dimensional Q-vector was always equal to 1. However, when more than one attribute is instructed, more complex items that administer multiple attributes can yield more diverse mastery outcomes. By incorporating the information in the Q-vector, we determine whether selecting a particular item yields a higher expected sum of attributes. Below, we define the Maximum One-step-ahead Expected Sum of attributes (MOES):

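$$E\Big[\sum_k \alpha_k(t+1) \,\Big|\, y_1, \ldots, y_t, y_{t+1}=1\Big] \propto \sum_{m=0}^{K} m \times \Big[ L\Big(y_1, \ldots, y_t, y_{t+1}=1 \,\Big|\, \sum_k \alpha_k(t+1)=m\Big)\, P\Big(\sum_k \alpha_k(t+1)=m\Big) \Big]$$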

Note that the CDM parameters are implicit in the likelihood $L(y_1, \ldots, y_t, y_{t+1}=1 \mid \sum_k \alpha_k(t+1)=m)$ and in the probability of the sum, $P(\sum_k \alpha_k(t+1)=m)$. The event $\{\sum_k \alpha_k(t+1)=m\}$ includes every possible path for all attribute patterns that return the particular sum of the attributes:

$$P\Big(\sum_k \alpha_k(t+1)=m\Big) = \sum_{\{\alpha' : \sum_k \alpha'_k = m\}} P(\alpha(t+1)=\alpha') = \sum_{\{\alpha' : \sum_k \alpha'_k = m\}} \sum_{n=1}^{t+1} P(\alpha(n)=\alpha')$$

Thus, the MOES method can be used in the same way as the MOPP method. That is, we calculate the MOES for every candidate item in the item bank and choose the item with the largest value.
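The selection step can be illustrated with a minimal sketch. It is not the implementation used in this study, and it rests on several assumptions that are not stated in the text: the posterior over attribute patterns after t responses is available as a vector, each item's transition matrix has rows indexed by α(t) and columns by α(t+1), the response to the candidate item is generated under the DINA model from the post-transition pattern α(t+1), and the names `all_patterns`, `dina_p_correct`, `moes_score` and `select_item` are hypothetical. Summing over patterns, weighted by the number of mastered attributes, is equivalent to grouping the patterns by their sum m.

```python
import itertools
import numpy as np

def all_patterns(K):
    """All 2^K binary attribute patterns, one per row (hypothetical helper)."""
    return np.array(list(itertools.product([0, 1], repeat=K)))

def dina_p_correct(patterns, q, guess, slip):
    """DINA probability of a correct response for every pattern."""
    # eta is True when the pattern holds every attribute required by q
    eta = np.all(patterns >= q, axis=1)
    return np.where(eta, 1.0 - slip, guess)

def moes_score(posterior, transition, q, guess, slip, patterns, normalize=False):
    """One-step-ahead expected sum of attributes given a correct response.

    Assumption: `posterior` is P(alpha(t) | y_1..y_t) and `transition[i, j]`
    is P(alpha(t+1) = patterns[j] | alpha(t) = patterns[i]) for this item.
    """
    # Distribution of alpha(t+1) before observing the next response
    prior_next = posterior @ transition
    # Likelihood of a correct response times the prior, per pattern
    weights = dina_p_correct(patterns, q, guess, slip) * prior_next
    score = float(np.sum(patterns.sum(axis=1) * weights))
    if normalize:
        # Dividing by P(y_{t+1} = 1 | past) turns the score into an expectation
        score /= float(weights.sum())
    return score

def select_item(posterior, items, patterns, normalize=False):
    """Return the index of the item with the largest score, plus all scores."""
    scores = [
        moes_score(posterior, it["transition"], it["q"], it["guess"],
                   it["slip"], patterns, normalize)
        for it in items
    ]
    return int(np.argmax(scores)), scores
```

By default the score is left unnormalized, matching the proportionality in the definition above. Because the normalizing constant depends on the candidate item, the `normalize` flag is offered as an option rather than as a claim about how the comparison across items was carried out.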

Stopping Rule and Threshold Calibration

The ultimate goal of the assessment in the application we consider is to achieve complete mastery of K attributes, K > 1. In contrast to the previous detection schemes, this design imposes no fixed order in which the attributes are detected. Accordingly, a stopping rule that declares complete mastery is used when the MOES method selects items throughout the assessment. Specifically, we are interested only in the mastery of all attributes.

Let L denote the moment when all targeted attributes are present. After each response is observed, we test the null hypothesis {L > t} at the most recent (tth) item against the alternative hypothesis {L ≤ t}. The alternative hypothesis {L ≤ t} is composite, and for any given m ∈ {1, ..., t}, the likelihood ratio for {L = m} is again the one given in Section 4.1.2.2.

4.3.3 Simulation Study

To evaluate the performance of the MOES method, we simulated educational assessments that aim to teach more than one attribute. To benchmark the MOES method, we ran the same simulated assessments with the MOPP method in parallel.

4.3.3.1 Simulation Design

For an examinee population of size N = 10,000, two factors, the number of attributes (K) and the item bank size (J), were varied to compare the MOES method with the MOPP method. The numbers of attributes considered were 2 and 3, and the item bank sizes were 150 and 300, giving a 2 × 2 design. The response data and items were generated under the DINA model.

For each simulation design, J items were generated: for each item, the DINA guessing and slipping probabilities, a Q-vector, and a transition probability matrix were drawn. The guessing and slipping probabilities were drawn from a uniform distribution on the interval from 0.05 to 0.40. Each Q-vector q was drawn with probability proportional to $1/\sum_k q_k$, normalized over all possible $q \in \{0,1\}^K \setminus \{0\}^K$. Thus, for K = 2, the q-vectors (0,1) and (1,0) are each drawn with probability 0.4 and (1,1) with probability 0.2. Likewise, for K = 3, (0,0,1), (0,1,0) and (1,0,0) are each drawn with probability 0.206, (0,1,1), (1,0,1) and (1,1,0) are each drawn with probability 0.103, and (1,1,1) is drawn with probability 0.069.
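The reported Q-vector probabilities can be reproduced with a short sketch. It assumes that each nonzero q-vector is weighted by the reciprocal of the number of attributes it measures, a reading inferred from the probabilities quoted above rather than taken from a stated formula. The function names `q_vector_probabilities` and `draw_q_vectors` and the `seed` parameter are hypothetical.

```python
import itertools
import random

def q_vector_probabilities(K):
    """Sampling probability of every nonzero q-vector of length K.

    Assumption: weight of q is 1 / (number of attributes measured by q),
    normalized so that the weights sum to one.
    """
    qs = [q for q in itertools.product([0, 1], repeat=K) if any(q)]
    weights = [1.0 / sum(q) for q in qs]
    total = sum(weights)
    return {q: w / total for q, w in zip(qs, weights)}

def draw_q_vectors(K, J, seed=0):
    """Draw J q-vectors for an item bank using the probabilities above."""
    probs = q_vector_probabilities(K)
    rng = random.Random(seed)
    return rng.choices(list(probs.keys()), weights=list(probs.values()), k=J)
```

For K = 2 this gives 0.4 for each single-attribute vector and 0.2 for (1,1). For K = 3 it gives approximately 0.2069, 0.1034 and 0.0690, which agree with the reported 0.206, 0.103 and 0.069 up to the last digit.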

The transition probability matrix is generated for each item according to the pattern of its q-vector. For a simple item, there is only one attribute that can be acquired; for a more complex item that instructs more than one attribute, the transition matrix allows transitions to any subset of the item's targeted attributes. The transition probabilities are also weighted with respect to the number of attributes administered.

For any given Q-vector, we sampled the transition probabilities as follows:

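$$P(\alpha(t+1) \mid \alpha(t)) = \begin{cases} \mathrm{Unif}(0, 1) & \text{if } \alpha(t) = \alpha(t+1) \\ \mathrm{Unif}\Big(0, \dfrac{1}{2\,\|\alpha(t+1)-\alpha(t)\|}\Big) & \text{if } \alpha(t) \neq \alpha(t+1) \end{cases}$$

4.3.3.2 Simulation Results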

Tables 4.11 to 4.14 display the mastery (l) and delay (d − l > 0) statistics, namely the arithmetic mean and standard deviation, for each simulation design. Through these results, we evaluate the performance of the MOES method under varying simulation conditions. Overall, the results indicate that the MOES method allows us to administer instruction of multiple attributes with adaptive item selection and thus hasten the process of mastery and detection.

It is important to note that, although we compare the MOES and MOPP methods in parallel for all simulation designs, the K attributes were administered simultaneously in the MOES simulations and individually in the MOPP simulations. That is, for the MOES method, the mastery time l is simply the number of items until complete mastery of all K attributes is attained, whereas for the MOPP method, l is the sum of the numbers of items needed to master each attribute.

Table 4.11 and Table 4.12 present the mastery and delay statistics for the simulation designs with the smaller item bank of size 150. In Table 4.11, two attributes are instructed throughout the assessment (K = 2). For the MOES method, an average of approximately two items were used until mastery was achieved, whereas for the MOPP method, a mean of 4.44 items were used until complete mastery of the two attributes was achieved. Because it must account for two isolated detections, the MOPP method took more items on average than the MOES method until mastery occurred. The detection delay of the MOES method is nearly half that of the MOPP method. Similar trends in both mastery time and detection delay are shown for the K = 3 simulation. The detection delay for the K = 3 case is larger than that for the K = 2 case; this is due to the more conservative threshold that was set to meet the 0.01 false detection rate. For both cases, it is apparent that using the MOES method roughly halved the entire length of the assessment.

Item-selection   Mastery (l)          Delay (d − l > 0)    Assessment length (d)
Method           Mean    Std. Dev.    Mean    Std. Dev.    Mean     Std. Dev.
MOES             1.99    1.60         2.19    2.04         4.33     3.59
MOPP             4.44    1.80         4.03    1.89         8.20     3.65

Table 4.11: Comparing MOES and MOPP: Mastery and delay statistics (K = 2, J = 150)

Item-selection   Mastery (l)          Delay (d − l > 0)    Assessment length (d)
Method           Mean    Std. Dev.    Mean    Std. Dev.    Mean     Std. Dev.
MOES             2.89    1.98         2.98    2.39         5.66     4.60
MOPP             6.42    2.18         7.55    2.56         13.65    5.23

Table 4.12: Comparing MOES and MOPP: Mastery and delay statistics (K = 3, J = 150)

In Tables 4.13 and 4.14, with the item bank size of 300 (i.e., J = 300), the mean numbers of items until mastery are reduced compared with those for the smaller item bank. With the larger item bank, the item selection methods were able to select items with greater transition probabilities and smaller guessing and slipping parameters more often. The general trends from Tables 4.11 and 4.12 persist, however. Increasing the item bank size reduced the learning times and detection delays overall.

Item-selection   Mastery (l)          Delay (d − l > 0)    Assessment length (d)
Method           Mean    Std. Dev.    Mean    Std. Dev.    Mean     Std. Dev.
MOES             1.92    1.61         2.11    1.97         4.03     3.02
MOPP             4.06    1.76         3.97    1.85         7.64     3.13

Table 4.13: Comparing MOES and MOPP: Mastery and delay statistics (K = 2, J = 300)

Item-selection   Mastery (l)          Delay (d − l > 0)    Assessment length (d)
Method           Mean    Std. Dev.    Mean    Std. Dev.    Mean     Std. Dev.
MOES             2.61    1.89         2.87    2.21         5.50     4.31
MOPP             5.99    2.00         7.61    2.49         13.33    5.11

Table 4.14: Comparing MOES and MOPP: Mastery and delay statistics (K = 3, J = 300)

Overall, the MOES method yields substantially smaller average mastery times (l) and delays (d − l > 0), while the standard deviations are generally reduced for the MOES method as well, though less dramatically. Under both methods, the duration until mastery and the overall assessment length increase roughly linearly with the number of attributes.
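As a quick arithmetic check of the comparison, the ratio of mean assessment lengths can be computed directly from the values reported in Tables 4.11 to 4.14. The snippet below is only an illustration of that calculation; the variable names are hypothetical.

```python
# Mean assessment length d copied from Tables 4.11 to 4.14.
# Keys are (K, J); values are (MOES mean, MOPP mean).
mean_length = {
    (2, 150): (4.33, 8.20),
    (3, 150): (5.66, 13.65),
    (2, 300): (4.03, 7.64),
    (3, 300): (5.50, 13.33),
}

# Ratio of MOES to MOPP mean assessment length for each design
ratios = {design: moes / mopp for design, (moes, mopp) in mean_length.items()}

for (K, J), ratio in sorted(ratios.items()):
    print(f"K={K}, J={J}: MOES/MOPP mean assessment length = {ratio:.2f}")
```

The ratios come out at roughly 0.53 for K = 2 and 0.41 for K = 3 for both item bank sizes, so the relative saving from MOES is larger when more attributes are instructed.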

In document Sequential mastery detection and Bayesian learning promotion under cognitive diagnosis models (Page 75-80)