The current studies represent an effort to advance the feasibility of CD-CAT, an
intelligent educational measurement tool that was envisioned as enhancing individualized
learning over twenty years ago. On one hand, recent developments in cognitive diagnostic
modeling and CAT have equipped psychometricians with tools they can use to embark on the
development of CD-CAT. On the other hand, the CD assessment components in the PARCC and
Smarter Balanced assessments and the pedagogical issues in MOOCs present great opportunities for CD-CAT.
The current studies have focused on the crucial element of CD-CAT: item selection
algorithms. A comprehensive review of item selection algorithms in CD-CAT was conducted.
Several new selection algorithms were proposed to address two important issues in CD-CAT:
measurement efficiency and item exposure control. The PWCDI and PWACDI are
computationally affordable and highly efficient alternatives to other information index-based
algorithms. They can be used as building blocks for developing algorithms that deal with
issues such as item exposure control, content balancing, and dual-purpose CD-CAT.
All of these can develop into interesting future studies.
Although the binary stratification algorithm is a simpler alternative to the information
index-based methods, the current research has demonstrated its edge in balancing item exposure
rates in both fixed-length and variable-length CD-CAT. The stratification method has been well
studied in traditional CAT. It offers an elegant solution to item exposure control. It also has
the potential to solve item selection problems when multiple constraints must be taken into
The two newly proposed approaches in the current studies may appear to be competitors,
but this is not necessarily the case, because each may be the better fit in a different
scenarios. In general, PWCDI and PWACDI are preferred when measurement efficiency is the
top priority, while binary stratification is more advantageous in highly constrained CD-CAT. In
some applications that have multiple constraints, there exists the possibility of using a hybrid