3.2 PART 2 – SIMULATION STUDY
4.1.4 Posterior Predictive Model Checking
4.1.4.2 Total Score Distribution
The IRTree Model, the MNRM, and the MPCM were analyzed for test-level model fit using discrepancy measures related to the total score. The total score may capture the pattern of extreme response tendencies. Individuals with a high level of the substantive trait and a strong tendency to select extreme responses are likely to select 4-Strongly Agree, so they will tend to have very high total scores. Individuals with a low level of the substantive trait and a strong tendency to select extreme responses are likely to select 1-Strongly Disagree, so they will tend to have very low total scores. Respondents with moderate degrees of extreme response tendencies will tend to have total scores near the middle of the total score distribution. Overall, a model that captures the patterns of the total score will likely capture some properties of extreme response tendencies.
Comparison of the observed and model-predicted total score distributions is a test-level method of analyzing model fit. The mean and the standard deviation of the total score distribution serve as the discrepancy measures. The PPP-value for each measure is the proportion of replicated datasets in which the statistic (mean or standard deviation) is greater than its value in the observed total score distribution. A PPP-value near 0.50 therefore indicates that the observed statistic falls near the center of the predicted distribution, whereas a PPP-value near 0 or 1 indicates that the model systematically underestimates or overestimates the statistic, respectively.
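The PPP-value calculation described above can be illustrated with a minimal sketch. It is not the code used in this study. The function names (`total_scores`, `ppp_value`) are hypothetical, and the toy responses are drawn uniformly at random on a 4-point scale as a stand-in for posterior predictive draws from a fitted model. The sketch uses 200 replications rather than 2000 only to keep the example quick.

```python
import random
import statistics


def total_scores(responses):
    """Sum each respondent's item responses into a total score."""
    return [sum(person) for person in responses]


def ppp_value(observed, replicated_datasets, statistic):
    """Proportion of replicated datasets whose total score statistic
    is strictly greater than the observed total score statistic."""
    t_obs = statistic(total_scores(observed))
    t_rep = [statistic(total_scores(rep)) for rep in replicated_datasets]
    return sum(t > t_obs for t in t_rep) / len(t_rep)


# Toy illustration only: uniform random responses replace real
# posterior predictive draws (an assumption, not the study's data).
rng = random.Random(0)  # fixed seed for reproducibility
n_persons, n_items = 50, 10


def simulate():
    """Generate one toy dataset of responses scored 1 to 4."""
    return [[rng.randint(1, 4) for _ in range(n_items)]
            for _ in range(n_persons)]


observed = simulate()
replicated = [simulate() for _ in range(200)]

ppp_mean = ppp_value(observed, replicated, statistics.mean)
ppp_sd = ppp_value(observed, replicated, statistics.stdev)
print(f"PPP-value (mean): {ppp_mean:.2f}")
print(f"PPP-value (SD):   {ppp_sd:.2f}")
```

Because the toy observed and replicated datasets come from the same generating process, neither PPP-value is expected to be systematically extreme. With a misfitting model, the values would drift toward 0 or 1.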
Histograms are used to visualize the replicated total score distribution means and standard deviations. The mean histogram plots the frequency of total score means from the 2000 replicated datasets. A vertical line representing the mean of the observed total score distribution is overlaid on the histogram. The standard deviation histogram plots the frequency of total score standard deviations from the 2000 replicated datasets. The vertical line overlaid on this histogram marks the standard deviation of the observed total score distribution.
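The construction of such a histogram can be sketched in a few lines. The sketch below is a text-based stand-in for the plotted figures, not the plotting code used in this study. The function names and the toy values are hypothetical. It bins the replicated statistics into equal-width bins and marks the bin containing the observed statistic, which plays the role of the overlaid vertical line.

```python
def histogram_counts(values, n_bins, lo, hi):
    """Count values in n_bins equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # A value equal to the upper edge is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def render_histogram(replicated_stats, observed_stat, n_bins=10):
    """Return text rows of a histogram of replicated statistics,
    marking the bin that contains the observed statistic."""
    lo = min(min(replicated_stats), observed_stat)
    hi = max(max(replicated_stats), observed_stat)
    if hi == lo:  # guard against a zero-width range
        hi = lo + 1.0
    width = (hi - lo) / n_bins
    counts = histogram_counts(replicated_stats, n_bins, lo, hi)
    obs_bin = min(int((observed_stat - lo) / width), n_bins - 1)
    lines = []
    for i, count in enumerate(counts):
        marker = " <- observed" if i == obs_bin else ""
        lines.append(f"{lo + i * width:8.2f} | {'#' * count}{marker}")
    return lines


# Hypothetical replicated means and observed mean, for illustration only.
toy_replicated_means = [24.1, 24.6, 24.8, 25.0, 25.0, 25.2, 25.3, 25.7, 26.0]
toy_observed_mean = 25.1
for line in render_histogram(toy_replicated_means, toy_observed_mean, n_bins=5):
    print(line)
```

An observed marker near the tallest bins corresponds to a PPP-value near 0.50. A marker in or beyond either tail corresponds to a PPP-value near 0 or 1.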
Figure 18 displays the total score mean and total score standard deviation histograms produced using replicated data under the IRTree Model. The means of the predicted (replicated) total scores are consistently greater than the observed mean total score. This pattern is quantified by a PPP-value equal to 1. In other words, 100% of the predicted total score distributions had a mean greater than the mean of the observed total scores. This indicates that the data do not adequately fit the IRTree Model. The standard deviation of the observed total scores is located in the upper tail of the distribution of the predicted total score standard deviations. This is quantified by a PPP-value of 0.01: 99% of the predicted total score distributions had a standard deviation less than the observed total score standard deviation. This also indicates that the data do not adequately fit the IRTree Model. Overall, based on these discrepancy measures, the IRTree Model overestimated the mean total score while underestimating the spread.
Figure 19 displays the histogram of predicted mean total scores and the histogram of predicted total score standard deviations computed under the MNRM. The observed mean total score is located at the center of the histogram of predicted mean total scores. This evidence of adequate test-level fit is quantified by the PPP-value of 0.49: approximately half of the predicted mean total scores were above the observed total score mean. In contrast, 100% of the predicted total score distributions had a standard deviation higher than the standard deviation of the observed total scores. The MNRM properly captured the total score mean but overestimated the standard deviation of the total scores.
Figure 18. Predicted mean total score distribution and predicted standard deviation of total score distribution under the IRTree Model
Figure 19. Predicted mean total score distribution and predicted standard deviation of total score distribution under the MNRM
Figure 20 displays the distribution of total score means and the distribution of total score standard deviations from the predicted responses under the MPCM. The observed total score mean is positioned in the center of the distribution of predicted total score means. A PPP-value equal to 0.51 is evidence of adequate test-level model fit. The standard deviation of the observed total scores lies near the center of the distribution of predicted total score standard deviations: 61% of the predicted total score distributions had a standard deviation greater than the observed total score standard deviation. The MPCM provided adequate fit for both the mean total score and the standard deviation of the total scores.
Figure 20. Predicted mean total score distribution and predicted standard deviation of total score distribution under the MPCM
A histogram of predicted total score means and a histogram of predicted total score standard deviations were developed under the GR model for baseline comparison. Figure 21 displays these histograms for the GR model. The GR model captured the mean of the total score distribution (PPP-value = 0.37) but not the standard deviation (PPP-value = 0.99). The PPP-values for the mean total score under the MNRM and under the MPCM were closer to 0.50 than the PPP-value for the mean total score under the GR model. The GR model captured the observed total score mean and the observed total score standard deviation more effectively than the IRTree Model. The MPCM captured the observed total score standard deviation better than the GR model.
Figure 21. Predicted mean total score distribution and predicted standard deviation of total score distribution under the GR Model
Overall, the MNRM and the MPCM provided adequate test-level model fit when the mean total score was used as a discrepancy measure. When the standard deviation of total scores was used as a discrepancy measure, only the MPCM provided adequate model fit. The IRTree Model did not provide any advantage of model fit over the unidimensional GR model when the mean total score and the standard deviation of total scores were considered as discrepancy measures.
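The PPP-values reported in this section can be collected and compared side by side, as in the sketch below. The values are taken from the text above, with the two standard deviation results stated only as percentages (100% for the MNRM and 61% for the MPCM) converted to 1.00 and 0.61. The cutoffs of 0.05 and 0.95 for flagging an extreme PPP-value are an assumption made for illustration and are not a criterion stated in this study. The function names are hypothetical.

```python
# PPP-values as reported in this section (mean and SD of total scores).
REPORTED_PPP = {
    "IRTree": {"mean": 1.00, "sd": 0.01},
    "MNRM": {"mean": 0.49, "sd": 1.00},
    "MPCM": {"mean": 0.51, "sd": 0.61},
    "GR": {"mean": 0.37, "sd": 0.99},
}

# Assumed cutoffs for illustration only; not taken from the study.
LOWER, UPPER = 0.05, 0.95


def is_extreme(ppp, lower=LOWER, upper=UPPER):
    """Flag a PPP-value that falls outside the assumed cutoffs."""
    return ppp < lower or ppp > upper


def summarize(table):
    """Return (model, statistic, PPP, distance from 0.50, extreme?) rows."""
    rows = []
    for model, stats in table.items():
        for stat, ppp in stats.items():
            rows.append((model, stat, ppp,
                         round(abs(ppp - 0.5), 2), is_extreme(ppp)))
    return rows


def models_adequate_on(table, stat):
    """List models whose PPP-value for one statistic is not extreme."""
    return [m for m, stats in table.items() if not is_extreme(stats[stat])]


for model, stat, ppp, dist, extreme in summarize(REPORTED_PPP):
    flag = "extreme" if extreme else "adequate"
    print(f"{model:7s} {stat:4s} PPP={ppp:.2f} |PPP-0.50|={dist:.2f} {flag}")

print("Adequate on mean:", models_adequate_on(REPORTED_PPP, "mean"))
print("Adequate on SD:  ", models_adequate_on(REPORTED_PPP, "sd"))
```

Under these assumed cutoffs, the distance from 0.50 gives a simple way to rank the models on each discrepancy measure. A different choice of cutoffs would change which values are flagged but not the ranking.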