We will take Q15 as a specific example of the type of modification we will make to the difficult items. Fewer than 10% of students answered Q15 correctly, far below what is acceptable for a CI.
The item covers finding vulnerabilities in a defense and falls under the concept Identify Vulnerabilities and Failures (V). The scenario describes a hypothetical nuclear treaty between two countries that requires a method of securely transmitting a message from a monitoring device. Neither country trusts the other, and the design must be fair to each country. Both parties want assurances that the message is not modified. Country A wants to ensure that the message originates from the device. Country B wants to monitor the message data in real time. The premise is: “The sender applies a keyed cryptographic hash function to each message using a key distributed only to the sender, Country A, and Country B.” Students are expected to find potential vulnerabilities in the suggested outputs of the device.
The options the students had for Q15 are below.
(a) The message together with a hash of the following: message and current time.
(b) The key together with a hash of the message.
(c) The message together with a hash of the message.
(d) A hash of the message.
(e) This design cannot satisfy the system requirements.
Our distractor analysis revealed that the best students chose Option A more often than the correct answer. This finding shows that, as students’ knowledge increased, this wrong answer became more compelling. A well-constructed item should instead lead students to pick the correct answer more often as their knowledge increases.
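The kind of tabulation behind a distractor analysis can be sketched as follows. This is only a minimal illustration, not the analysis procedure used in the pilot: the function name `distractor_table`, the split into three score groups, and the response data are all assumptions invented for the example and do not reproduce any pilot results.

```python
from collections import Counter

def distractor_table(responses, n_groups=3):
    """Count chosen options within score groups, lowest group first.

    responses: list of (total_test_score, option_chosen_on_this_item).
    The number of groups is an illustrative assumption.
    """
    ranked = sorted(responses, key=lambda r: r[0])
    groups = []
    for g in range(n_groups):
        start = g * len(ranked) // n_groups
        end = (g + 1) * len(ranked) // n_groups
        groups.append(Counter(option for _, option in ranked[start:end]))
    return groups

# Invented toy responses (NOT pilot data): (total score, option chosen).
toy_responses = [
    (5, "C"), (6, "D"), (7, "B"),
    (9, "C"), (10, "A"), (11, "C"),
    (14, "A"), (15, "A"), (16, "E"),
]

low, middle, high = distractor_table(toy_responses)
for name, counts in (("low", low), ("middle", middle), ("high", high)):
    print(name, dict(counts))
```

In this toy data the highest-scoring group picks the distractor A more often than the keyed answer E, which is the pattern that flags an item for revision.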
The preference for Option A is understandable given that it is more reasonable than the other three options. Options B and D do not even send the original message, so the message cannot be verified. Options A and C do not guarantee that the device is the source of the message: since each party has the key, either country can modify the message and attach a new hash. Because A has the same structure as C with the current time added to the hash, it appears to be strictly superior to C and thus the best option. Students must see the problems with each option and select Option E, which serves as a “none of the above.” Including a “none of the above” option generally makes assessments harder [24], especially because Options A and C satisfy some of the desired properties.
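The weakness shared by Options A and C can be illustrated with a short sketch. It assumes HMAC-SHA256 as the keyed hash and assumes the timestamp is available to the verifier; the scenario specifies neither. The key, the messages, and the helper names `tag` and `verify` are hypothetical.

```python
import hashlib
import hmac

def tag(key: bytes, *parts: bytes) -> bytes:
    """Keyed hash over length-prefixed parts (HMAC-SHA256 is an assumption)."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(len(part).to_bytes(4, "big"))
        mac.update(part)
    return mac.digest()

def verify(key: bytes, message: bytes, timestamp: bytes, received: bytes) -> bool:
    """Check a received tag the way either country could."""
    return hmac.compare_digest(tag(key, message, timestamp), received)

# Hypothetical key held by the sender, Country A, and Country B.
shared_key = b"hypothetical-shared-key"

# Device output in the style of Option A: message plus hash(message, time).
message = b"reading=12.7"
timestamp = b"2019-01-01T00:00:00Z"
device_tag = tag(shared_key, message, timestamp)

# Either country holds the same key, so it can substitute a message
# and compute a tag that verifies just as well as the device's.
forged_message = b"reading=0.0"
forged_tag = tag(shared_key, forged_message, timestamp)

print(verify(shared_key, message, timestamp, device_tag))         # genuine
print(verify(shared_key, forged_message, timestamp, forged_tag))  # forged, still accepted
```

Both checks succeed, so a valid tag shows only that some key holder produced the message, not that the device did. Adding the time to the hash does not change this, because the forger can hash any timestamp it likes.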
The problem with this item, and with “none of the above” options in general, is that Option E makes no assertion. As a result, students tend to pick the most reasonable of the other choices. We have modified this item so that Option E makes an assertion: “The design does not work because Countries A and B can modify the message.” This wording gives students a definite claim to check, from which they can conclude that the other options do not satisfy the requirements. We anticipate that this change, while minor, will make the item easier and help it differentiate among more students.
After making similar modifications to other items, we will administer the instrument to more students and reanalyze the results. With the easier items, the item difficulties should cover a wider range and better separate students. This wider range of difficulties, together with the modification of items that are too difficult, should increase the discriminatory power of the CCI and improve the CCI’s validity and usefulness.
CHAPTER 6
CONCLUSION
The purpose of the expert review and pilot test was to evaluate the validity of the CCI. The expert review and pilot testing revealed that the CCI reliably tests students’ knowledge of cybersecurity. At this point, the CCI could be used as an evaluation instrument, but the scores would be low, and low scores reduce the discriminatory power of the assessment. By making the CCI easier, we expect to create an assessment that is broadly applicable and provides useful measurements across a broad range of cybersecurity students. Further research will cover the modifications of the items and testing with more students.
REFERENCES
[1] A. Wirth, “The importance of cybersecurity training for HTM professionals,” Biomedical Instrumentation & Technology, vol. 50, no. 5, pp. 381–383, 2016.
[2] M. Libicki, D. Senty, and J. Pollak, Hackers Wanted: An Examination of the Cybersecurity Labor Market. The RAND Corporation, Jan. 2014.
[3] G. Parekh, D. DeLatte, G. Herman, L. Oliva, D. Phatak, T. Scheponik, and A. T. Sherman, “Identifying core concepts of cybersecurity: Results of two Delphi processes,” IEEE Transactions on Education, vol. 61, pp. 11–20, Feb. 2018.
[4] T. Scheponik, A. T. Sherman, D. DeLatte, D. Phatak, L. Oliva, J. Thompson, and G. L. Herman, “How students reason about cybersecurity concepts,” IEEE Frontiers in Education Conference (FIE), pp. 1–5, Oct. 2016.
[5] A. T. Sherman, D. DeLatte, M. Neary, L. Oliva, D. Phatak, T. Scheponik, G. L. Herman, and J. Thompson, “Cybersecurity: Exploring core concepts through six scenarios,” Cryptologia, vol. 42, no. 4, pp. 1558–1586, Sept. 2018.
[6] J. Thompson, G. Herman, T. Scheponik, L. Oliva, A. T. Sherman, and E. Golaszewski, “Student misconceptions about cybersecurity concepts: Analysis of think-aloud interviews,” Journal of Cybersecurity Education, Research and Practice, vol. 2018, no. 1, pp. 1–29, Jul. 2018.
[7] A. T. Sherman, L. Oliva, D. DeLatte, E. Golaszewski, M. Neary, K. Patsourakos, D. Phatak, T. Scheponik, G. Herman, and J. Thompson, “Creating a cybersecurity concept inventory: A status report on the CATS project,” National Cyber Summit, pp. 1–5, Jun. 2017.
[8] A. T. Sherman, L. Oliva, E. Golaszewski, D. Phatak, T. Scheponik, G. Herman, D. S. Choi, S. Offenberger, P. Peterson, J. Dykstra, G. Bard, A. Chattopadhyay, F. Sharevski, R. Verma, and R. Vrecenar, “The cats hackathon: Creating and refining test items for cybersecurity concept inventories,” in IEEE Security and Privacy, 2019.
[9] R. Hake, “Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses,” American Journal of Physics, vol. 66, pp. 64–74, Jan. 1998.
[10] D. Hestenes, M. Wells, and G. Swackhamer, “Force concept inventory,” The Physics Teacher, vol. 30, pp. 141–158, Mar. 1992.
[11] T. Litzinger, P. Van Meter, C. Firetto, L. J. Passmore, C. B. Masters, S. R. Turns, G. L. Gray, F. Costanzo, and S. Zappe, “A cognitive study of problem solving in statics,” Journal of Engineering Education, vol. 99, pp. 337–353, Oct. 2010.
[12] D. Evans, G. Gray, S. Krause, J. Martin, C. Midkiff, B. Notaros, M. Pavelich, D. Rancour, T. Reed, P. S. Steif, R. Streveler, and K. Wage, “Progress on concept inventory assessment tools,” IEEE Frontiers in Education Conference (FIE), pp. T4G–1–T4G–8, Nov. 2003.
[13] K. Douglas and S. Purzer, “Validity: Meaning and relevancy in assessment for engineering education research: Assessment validity for engineering education research,” Journal of Engineering Education, vol. 104, no. 2, pp. 108–118, Apr. 2015.
[14] J. Libarkin, “Concept inventories in higher education science,” National Research Council Promising Practices in Undergraduate STEM Education Workshop, Oct. 2008.
[15] National Research Council, Division of Behavioral and Social Sciences and Education, Board on Testing and Assessment, Center for Education, Committee on the Foundations of Assessment, Knowing What Students Know: The Science and Design of Educational Assessment, J. W. Pellegrino, N. Chudowsky, and R. Glaser, Eds. Washington, DC: The National Academies Press, 2001.
[16] B. B. Brown, Delphi Process A Methodology Used for the Elicitation of Opinions of Experts. The RAND Corporation, 1968.
[17] G. Herman, C. Zilles, and M. C. Loui, “A psychometric evaluation of the digital logic concept inventory,” Computer Science Education, vol. 24, pp. 277–303, Oct. 2014.
[18] N. Jorion, B. Gane, K. James, L. Schroeder, L. V. DiBello, and J. Pellegrino, “An analytic framework for evaluating the validity of concept inventory claims,” Journal of Engineering Education, vol. 104, pp. 454–496, Oct. 2015.
[19] J. Ryan and F. Brockmann, A Practitioners Introduction to Equating with Primers on Classical Test Theory and Item Response Theory. Distributed by ERIC Clearinghouse, Jun. 2009.
[20] J. Cappelleri, J. Lundy, and R. Hays, “Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures,” Clinical Therapeutics, vol. 36, pp. 648–662, May 2014.
[21] L. J. Cronbach, “Coefficient alpha and internal structure of tests,” Psychometrika, vol. 16, pp. 297–334, Sept. 1951.
[22] P. Panayides, “Coefficient alpha: Interpret with caution,” Europe’s Journal of Psychology, vol. 9, no. 4, pp. 688–696, Nov. 2013.
[23] S. Testa, A. Toscano, and R. Rosato, “Distractor efficiency in an item pool for a statistics classroom exam: Assessing its relation with item cognitive level classified according to blooms taxonomy,” Frontiers in Psychology, vol. 9, no. 1585, pp. 1–12, Aug. 2018.
[24] D. DiBattista, J.-A. Sinnige-Egger, and G. Fortuna, “The none of the above option in multiple-choice testing: An experimental study,” The Journal of Experimental Education, vol. 82, no. 2, pp. 168–183, 2014.