
H0.2: The observed throughput in an introductory programming course is independent of programming notation and development environment.

6.3.2 Participant Population and Sample Size

A sound experimental design requires attention to the manner in which participants for the study are selected (Whitley 1997). The design of the current study requires that participants be identifiable within the student body registered for the introductory programming course for the 2003 academic year. UPE’s Ethics Committee approved this participation in the previous academic year (Appendix G).

The students considered as potential participants are historically first year students, a choice in line with prior similar research at UPE (Calitz 1997; Greyling 2000). Consequently, neither repeating registrations nor registrations after the first year of registration at UPE are included as participants. Repeating students, and students who are not historically first year students even though they are registering for the introductory programming course for the first time, cannot be compared with students entering the university for the first time, because prior exposure to a tertiary level learning environment introduces a degree of bias in the former category of students.

Similar prior studies in programming have indicated that introductory course programmers form a heterogeneous population (Whitley 1997). Experience levels can range from first exposure to computers to highly competent. This heterogeneity creates difficulties when the effects of independent variables are hidden by the variance within the participant population. A further factor influencing the identification of potential participants is that, when students are asked to volunteer for a programming exercise, those who perceive themselves as poor at programming are unlikely to choose to participate (McCracken et al. 2001). Consequently, a particular method of identifying participants is applied in the design of the experiment that forms the focus of the current investigation.

Ideally, the participants in the current study would be randomly assigned to treatment and control groups, resulting in a between-groups experimental design (Whitley 1997). In an educational environment like UPE, however, the random assignment of participants to groups is impractical, since students in many cases, and especially in the Department of CS/IS, decide their own course timetable for learning activities.

Consequently, for the purposes of the current investigation, independent course sections serve as the control and treatment groups, as reported in a similar study (Applin 2001). All historically first year students registered for the introductory course are obliged to take part in the study, in either the control or the treatment population. While neither the participants nor the treatment is randomly assigned, the control and treatment groups are each based on a stratified sample (Berenson et al. 1999).

A stratified sample is preferred because it provides an efficient way of ensuring that students across the entire population are represented. This in turn gives greater precision in the estimates of the underlying population parameters and ensures homogeneity of students within each stratum. The stratified samples for the current study are based on discrete pre-test measures within the independent course sections, namely the control and treatment groups.

The discrete pre-test measures are the results, or predicted marks, that participants obtain on UPE’s placement test, the background and implementation of which were discussed in Chapter 4 (Section 4.3). The pre-test applied to the participants in the study measures an aspect related to the learning material of an introductory programming course, but not knowledge of the material itself. The pre-test is considered valid and reliable because it has been researched and studied over a period of time and statistical measures are available for it (Greyling 2000; Greyling et al. 2002; Greyling et al. 2003).

The participants in each of the control and treatment groups are grouped into strata according to the discrete pre-test measure of predicted mark for the introductory programming course at UPE. The strata are characterized in Definition 6.1. A historically first year student registered for the introductory programming course in either the control or the treatment population is a participant of a particular stratum if the student’s predicted mark falls within the range specified for that stratum. Each stratum is uniquely identifiable within the control and treatment populations. Similarly, each participant is uniquely identifiable within the entire introductory programming course population.

\[
\text{Participant}_j \in \text{Stratum}_{im} \iff \text{pre-test measure}(\text{Participant}_j) \in [36+5i, \ldots, 40+5i] \;\land\; \text{experimental group}(\text{Participant}_j) = m,
\]

where i = 1, 2, …, 12 and m ∈ {treatment group, control group}

Definition 6.1: Membership of Participants in Strata
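The membership rule of Definition 6.1 can be illustrated with a minimal Python sketch. The function names `stratum_index` and `in_stratum`, the group labels and the use of whole-number predicted marks are assumptions made for illustration only and do not appear in the study.

```python
from typing import Optional

# Hypothetical labels for the two experimental groups (m in Definition 6.1).
GROUPS = ("treatment", "control")


def stratum_index(predicted_mark: int) -> Optional[int]:
    """Return the i in 1..12 with 36+5i <= mark <= 40+5i, or None.

    Assumes integer predicted marks; marks outside 41..100 fall in no stratum.
    """
    if not 41 <= predicted_mark <= 100:
        return None
    return (predicted_mark - 41) // 5 + 1


def in_stratum(predicted_mark: int, group: str, i: int, m: str) -> bool:
    """Membership test of Definition 6.1 for one participant."""
    return (
        1 <= i <= 12
        and m in GROUPS
        and 36 + 5 * i <= predicted_mark <= 40 + 5 * i
        and group == m
    )


if __name__ == "__main__":
    # A participant with predicted mark 63 in the control section.
    print(stratum_index(63), in_stratum(63, "control", 5, "control"))
```

Under these assumptions, a participant belongs to exactly one (i, m) pair, which matches the statement that each stratum is uniquely identifiable within the control and treatment populations.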

Because the selection and placement model at UPE applies a minimum pre-test measure of 40% (predicted mark) for the introductory programming course (Greyling et al. 2003), the current empirical analysis comprises a total of 12 strata per experimental group. The stratum identifiers and their discrete pre-test measure ranges are listed in Table 6.6.

| Stratum Identifier (m ∈ {treatment group, control group}) | Predicted mark range |
|---|---|
| Stratum_{1m} | 41, 42, …, 45 |
| Stratum_{2m} | 46, 47, …, 50 |
| Stratum_{3m} | 51, 52, …, 55 |
| Stratum_{4m} | 56, 57, …, 60 |
| Stratum_{5m} | 61, 62, …, 65 |
| Stratum_{6m} | 66, 67, …, 70 |
| Stratum_{7m} | 71, 72, …, 75 |
| Stratum_{8m} | 76, 77, …, 80 |
| Stratum_{9m} | 81, 82, …, 85 |
| Stratum_{10m} | 86, 87, …, 90 |
| Stratum_{11m} | 91, 92, …, 95 |
| Stratum_{12m} | 96, 97, …, 100 |

Table 6.6: Strata in Empirical Study
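As a consistency check, the ranges in Table 6.6 can be regenerated from the bounds 36+5i and 40+5i of Definition 6.1. The following Python sketch is illustrative only; the helper name `stratum_range` and the dictionary `table` are hypothetical, and integer predicted marks are assumed.

```python
def stratum_range(i: int) -> range:
    """Predicted marks covered by stratum i, i.e. 36+5i .. 40+5i inclusive."""
    if not 1 <= i <= 12:
        raise ValueError("stratum index must be between 1 and 12")
    return range(36 + 5 * i, 40 + 5 * i + 1)


# One entry per stratum; the ranges are the same for both experimental groups.
table = {i: stratum_range(i) for i in range(1, 13)}

# Every mark from 41 to 100 should appear exactly once across the strata.
covered = [mark for i in sorted(table) for mark in table[i]]

if __name__ == "__main__":
    for i, marks in table.items():
        print(f"Stratum {i:>2}: {marks[0]}, {marks[0] + 1}, ..., {marks[-1]}")
```

The list `covered` makes it easy to confirm that the 12 strata neither overlap nor leave gaps between predicted marks of 41 and 100.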

The sample size (n) of the participants in each of the treatment and control groups in the current investigation is equal in value and given by Definition 6.2. To maintain a
