
3. METHODOLOGY

3.3. Population and Sampling Design

This study includes a secondary analysis of the TIMSS datasets from Hong Kong and New Zealand gathered during the 1995, 1999, and 2003 administrations at the 8th grade. The rationale for using the 8th grade only is twofold: first, gender differences are much less pronounced at the 4th grade, and second, TIMSS 1999 administered the survey at the 8th grade only.

In IEA’s terminology, the target population for all countries is known as the “international desired population,” while the “national defined population” is the population each country actually used for sampling purposes (Foy & Joncas, 2000, pp. 30-31). In cross-country comparison studies, the target population can be defined in relation to either the age of the students or the grade they attend. Foy, Rust and Schleicher (1996) explain that “an age-based definition focuses on a specific age cohort” while the “grade-based definition focuses on a specific grade” (p. 4-3). However, the complexities of the education systems in the participating countries made it too difficult to find a comparable grade, so the age-based definition was adopted.

In 1995, the international desired target population was defined as follows: “All students enrolled in the two adjacent grades that contain the largest proportion of 13-year-old students at the time of testing” (Foy et al., 1996, p. 4-1).

The international desired target population had the same definition in TIMSS 1999 and 2003: “All students enrolled in the upper of the two adjacent grades that contain the largest proportion of 13-year-olds at the time of testing” (Foy & Joncas, 2000, p. 30; Foy & Joncas, 2004, p. 110). This grade level was intended to represent eight years of schooling, counting from the first year of primary or elementary schooling, and was the 8th grade in most countries. The upper grades of the TIMSS 1995 population definitions were intended to correspond to the TIMSS 1999 and 2003 8th-grade target populations (Foy et al., 1996; Foy & Joncas, 2000; Foy & Joncas, 2004). With this design, TIMSS allowed countries participating in 1995, 1999, and 2003 to gather trend data.

Some schools, and some students within schools, were excluded from the national defined populations. The exclusion criteria covered schools in geographically remote regions, extremely small schools, schools for students with special needs, and disabled students in regular schools (Foy et al., 1996, p. 4-5; Foy & Joncas, 2000, p. 31; Foy & Joncas, 2004, p. 110). In all three assessments, countries made efforts to keep the percentage of excluded schools at a very low level.

The sampling design used in TIMSS is described in the technical documentation provided by the TIMSS and PIRLS International Study Center (ISC) for each TIMSS cycle (Foy et al., 1996; Foy & Joncas, 2000; Foy & Joncas, 2004). In each country, TIMSS used a “two-stage stratified cluster” sampling procedure (Foy & Joncas, 2004, p. 113). In the first stage, schools were selected from the list of all schools with 8th-grade students using “probability proportional to size sampling (PPS) techniques” (Foy & Joncas, 2004, p. 118). In the second stage, one intact mathematics classroom was randomly selected from each sampled school. These sampling procedures were used in all three TIMSS administrations.
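The two stages described above can be illustrated with a minimal Python sketch. It is not the TIMSS sampling software: it assumes a systematic PPS selection with a random start, uses hypothetical function names and made-up enrolment figures, omits stratification (which would amount to running the first stage separately within each stratum), and assumes that no school is larger than the sampling interval, so that no school can be selected twice.

```python
import random


def pps_systematic_sample(sizes, n_sample, rng):
    """Stage 1 (illustrative): pick n_sample schools with probability
    proportional to size, using systematic selection with a random start.

    sizes    -- measure of size per school (e.g. 8th-grade enrolment)
    n_sample -- number of schools to select
    rng      -- a random.Random instance
    Returns the indices of the selected schools, in list order.
    """
    total = sum(sizes)
    interval = total / n_sample               # sampling interval
    start = rng.random() * interval           # random start within the first interval
    targets = [start + k * interval for k in range(n_sample)]

    selected = []
    cumulative = 0.0                          # total size of schools already passed
    idx = 0
    for target in targets:
        # Move forward until the current school's size range contains the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def sample_one_class_per_school(classes_by_school, school_ids, rng):
    """Stage 2 (illustrative): pick one intact class at random per school."""
    return {s: rng.choice(classes_by_school[s]) for s in school_ids}


if __name__ == "__main__":
    rng = random.Random(2003)
    enrolment = [50, 120, 80, 200, 150, 100, 60, 90, 110, 40]   # hypothetical
    schools = pps_systematic_sample(enrolment, 4, rng)
    classes = {i: [f"school{i}-classA", f"school{i}-classB"]
               for i in range(len(enrolment))}
    print(sample_one_class_per_school(classes, schools, rng))
```

Because larger schools occupy a wider range on the cumulative size scale, they are more likely to contain one of the equally spaced selection points, which is what makes the selection proportional to size.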

Foy and Joncas (2004) explain why the TIMSS sampling design selects intact mathematics classrooms rather than both mathematics and science classrooms:

At 8th grade, however, classrooms are usually organized by subject - mathematics, language, science, etc. - and it is more difficult to arrange classroom sampling. TIMSS has addressed this issue by choosing the mathematics class as the sampling unit, mainly because classes often are organized on the basis of mathematics instruction and because mathematics is a central focus of the study. (p. 114)

The quote alludes to the fact that, in some countries, science is offered as separate subjects (e.g., biology and chemistry), while in others it is offered as a general or integrated subject. Sampling science classes would therefore have been considerably more complicated, and the selection of mathematics classes was preferred.

The education system of any country is usually structured hierarchically, such that students are nested in classes, classes are nested in schools, and schools are nested in countries. In the TIMSS sampling design described above, only one classroom was sampled from each school in the majority of countries, so the class and school levels can be considered as one. One important aspect of such a hierarchical structure is that students from the same class/school have more characteristics in common than students randomly sampled from the whole population of eighth-graders in a country. For example, they have been taught by the same science teacher and have had access to the same school resources. This grouping effect, or within-class similarity, is measured by the intra-class correlation coefficient and has important consequences for the collection and analysis of data from clustered samples.
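To make the intra-class correlation coefficient concrete, the following Python sketch computes one common estimator of it, the one-way ANOVA estimator, from scores grouped by class. This is an illustration only: the function name and the toy scores are hypothetical, and it is not necessarily the estimator used in the TIMSS documentation or in the analyses of this study, where the coefficient may instead be obtained from the variance components of a multilevel model.

```python
def intraclass_correlation(groups):
    """One-way ANOVA estimator of the intra-class correlation.

    groups -- list of lists, one list of student scores per class
    Returns (MSB - MSW) / (MSB + (n0 - 1) * MSW), where n0 is the
    average class size adjusted for unequal class sizes.
    """
    k = len(groups)                                   # number of classes
    n_total = sum(len(g) for g in groups)             # number of students
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-class and within-class sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Adjusted average class size (equals the common size when balanced).
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


if __name__ == "__main__":
    # Hypothetical scores for three classes of three students each.
    scores = [[510, 520, 530], [450, 455, 470], [600, 590, 610]]
    print(round(intraclass_correlation(scores), 3))
```

A value close to 1 indicates that almost all of the variation in scores lies between classes, whereas a value close to 0 indicates that classmates are no more alike than randomly chosen students.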

Determining the grouping variable in the analysis of students’ science achievement using TIMSS data is a complex question that may be answered on a country-by-country basis. According to the TIMSS international reports (Martin & Kelly, 1996; Martin, Gregory & Stemler, 2000; Martin, Mullis & Chrostowski, 2004), both jurisdictions in this study offer an integrated science curriculum at the 8th grade. This means that in these two jurisdictions all sciences are taught as one subject, just like mathematics, as opposed to countries that offer the science curriculum as separate subjects (e.g., physics, chemistry, and biology). In countries with an integrated science curriculum, it is very likely that there is one science teacher for each class and, therefore, that all students from an intact mathematics classroom were taught by the same science teacher. Hence, given this feature of the science curriculum in these two jurisdictions, it is safe to assume that one intact mathematics classroom per school is equivalent to one intact science classroom per school, and thus that the classroom and school levels are the same. In the exploratory analysis conducted at the beginning of the study, the assumption of one science teacher linked to an entire classroom was verified for New Zealand and Hong Kong in all three TIMSS cycles.

The experts who developed the TIMSS sample design took the intra-class correlation coefficient into account when designing the sampling frame for each country. The value of the coefficient for each country “was estimated from previous cycles of IEA’s TIMSS, PIRLS, or from national assessments, and in the absence of these sources, an intra-class correlation of 0.3 was assumed” (Foy & Joncas, 2004, p. 115).

The intra-class correlation coefficient is inversely related to the effective sample size: for a fixed number of sampled students, a higher coefficient yields a smaller effective sample size. The TIMSS technical reports document that an effective sample size of 400 students was the basis for their sampling precision, such that “all student samples should yield sampling errors that are no greater than would be obtained from a simple random sample of 400 students” (Foy & Joncas, 2004, p. 114).
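The inverse relationship between the intra-class correlation and the effective sample size can be illustrated with the standard design-effect approximation for equal-sized clusters, deff = 1 + (m - 1) * rho, where m is the number of students per class and rho is the intra-class correlation. The Python sketch below applies this textbook formula; it is an assumption that this approximation is adequate here, the TIMSS technical reports may compute precision differently, and the function names and the class size of 30 are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_students, cluster_size, icc):
    """Size of the simple random sample that would give the same precision."""
    return n_students / design_effect(cluster_size, icc)


def classes_needed(target_effective_n, cluster_size, icc):
    """Number of intact classes needed to reach a target effective sample size."""
    students = target_effective_n * design_effect(cluster_size, icc)
    return math.ceil(students / cluster_size)


if __name__ == "__main__":
    # Hypothetical class size of 30 with the default intra-class correlation of 0.3.
    print(design_effect(30, 0.3))
    print(classes_needed(400, 30, 0.3))
```

Under these assumptions, classes of 30 students and an intra-class correlation of 0.3 give a design effect of about 9.7, so roughly 130 intact classes would be needed to match the precision of a simple random sample of 400 students.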

In document Modeling Science Achievement Differences Between Single-sex and Coeducational Schools: Analyses from Hong Kong, SAR and New Zealand from TIMSS 1995, 1999, AND 2003 (Page 81-85)