E 2.4 SEN Students
F.2 The Overlap Control Method
215. This maximizing of the overlap between PISA 2012 sampled schools and PISA 2003 sampled schools can occur under several different scenarios, all involving a comparison of the 2003 probabilities of selection with the 2012 probabilities of selection for the same schools on both frames. In Scenario A, ALL PISA 2012 probabilities of selection are larger than or equal to the corresponding PISA 2003 probabilities of selection. In Scenario B, ALL PISA 2012 probabilities of selection are smaller than or equal to the 2003 probabilities of selection. In Scenario C, NOT all PISA 2012 probabilities are larger than or equal to the PISA 2003 probabilities; in other words, there is a mix of larger and smaller probabilities.
216. The first step before deciding which scenario applies is to link the 2003 PISA frame with the 2012 PISA frame by the national school identifier. For all schools from the 2003 frame that link to the 2012 frame, the 2003 school probability of selection and an indicator of whether or not the school was selected for 2003 are added to the 2012 frame file by ACER and Westat.
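The linking step can be sketched as follows. This is a minimal illustration only: the field names (`school_id`, `probi`, `sampled_2003`, `mos`) and the helper `link_frames` are hypothetical and do not reflect the actual frame file layout used by ACER and Westat.

```python
def link_frames(frame_2012, frame_2003):
    """Attach 2003 selection information to each 2012 frame record.

    Schools are matched on the national school identifier; 2012 schools
    with no match are flagged as not being on the 2003 frame.
    """
    by_id = {s["school_id"]: s for s in frame_2003}
    linked = []
    for school in frame_2012:
        old = by_id.get(school["school_id"])
        linked.append({
            **school,
            "on_2003_frame": old is not None,
            "probi": old["probi"] if old is not None else None,
            "sampled_2003": old["sampled_2003"] if old is not None else False,
        })
    return linked


# Illustrative (made-up) frames.
frame_2003 = [
    {"school_id": "A01", "probi": 0.20, "sampled_2003": True},
    {"school_id": "A02", "probi": 0.10, "sampled_2003": False},
]
frame_2012 = [
    {"school_id": "A01", "mos": 120},
    {"school_id": "A02", "mos": 60},
    {"school_id": "A03", "mos": 45},  # new school, not on the 2003 frame
]
linked = link_frames(frame_2012, frame_2003)
```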
217. The next step is to do the explicit strata small school analyses for 2012 to determine explicit strata sample sizes. If the NPM accepts the overall sample size, then the PISA 2012 school probability of selection can be calculated for all schools on the 2012 frame.
218. For each school which was also on the 2003 frame, the 2012 probability of selection is compared to the 2003 probability of selection to determine which of scenarios A, B, or C applies.
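The comparison can be sketched as below, assuming each linked school is represented by a pair (PROBP, PROBI). The function name `classify_scenario` is hypothetical. When every pair is equal, both A and B hold; this sketch reports A in that case, which is a choice made here and not something the text specifies.

```python
def classify_scenario(pairs):
    """Classify the overlap scenario from (probp, probi) pairs.

    pairs: (2012 probability, 2003 probability) for each school that is
    on both the 2003 and the 2012 frame.
    """
    pairs = list(pairs)
    if all(probp >= probi for probp, probi in pairs):
        return "A"  # all 2012 probabilities >= 2003 probabilities
    if all(probp <= probi for probp, probi in pairs):
        return "B"  # all 2012 probabilities <= 2003 probabilities
    return "C"      # a mix of larger and smaller probabilities


scenario = classify_scenario([(0.30, 0.20), (0.10, 0.20)])
```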
F.2.A Scenario A
219. If we have scenario A, so that ALL PISA 2012 probabilities of selection are larger than or equal to those for 2003, then we apply maximum overlap control procedures in the following way.
220. To control overlap, the sample selection of schools for PISA adopts a modification of the approach due to Keyfitz (1951), based on Bayes' Theorem.
221. Assume that PROBI is the PISA 2003 probability of selection, and PROBP is the PISA 2012 probability of selection (where PROBP = MOS / stratum sampling interval), then a conditional probability of selection into PISA 2012, CPROB, is determined to maximize the overlap with PISA 2003, as follows.
222.
\[
\text{CPROB} =
\begin{cases}
\min\left(1,\ \dfrac{\text{PROBP}}{\text{PROBI}}\right) & \text{if the school was a PISA 2003 school} \\[2ex]
\max\left(0,\ \dfrac{\text{PROBP} - \text{PROBI}}{1 - \text{PROBI}}\right) & \text{if the school was not a PISA 2003 school} \\[2ex]
\text{PROBP} & \text{if the school was not a PISA 2003 eligible school}
\end{cases}
\]
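The three cases of the conditional probability can be sketched in code as below. This is an illustration under stated assumptions: the function name `cprob` and its arguments are hypothetical, "a PISA 2003 school" is read as a school sampled for 2003, and "not a PISA 2003 eligible school" is read as a school not on the 2003 frame (represented here by `probi=None`).

```python
def cprob(probp, probi=None, sampled_2003=False):
    """Conditional 2012 selection probability given the 2003 outcome.

    probp: PISA 2012 probability of selection (PROBP).
    probi: PISA 2003 probability of selection (PROBI), or None if the
           school was not on the 2003 frame (assumed representation).
    sampled_2003: whether the school was sampled for PISA 2003.
    """
    if probi is None:
        # Not on the 2003 frame: nothing to condition on.
        return probp
    if sampled_2003:
        return min(1.0, probp / probi)
    # Not sampled in 2003. PROBI = 1 cannot occur here, because such a
    # school would have been sampled with certainty.
    return max(0.0, (probp - probi) / (1.0 - probi))


# A 2003 sampled school whose probability has gone up is kept with certainty.
kept = cprob(0.30, 0.20, sampled_2003=True)
```

Averaging the two conditional values over the 2003 outcome, PROBI × CPROB(sampled) + (1 − PROBI) × CPROB(not sampled), returns PROBP in either direction, so the overall 2012 selection probability is left unchanged.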
223. Then a conditional MOS variable is created to coincide with these conditional probabilities as follows:
CMOS=CPROB × stratum sampling interval (rounded to 4 decimal places).
224. The PISA 2012 school sample is then selected using the line numbers created as usual (see B.6.3), but applied to the cumulated CMOS values (as opposed to the cumulated MOS values) (see B.3.6). Note that it is possible that the resulting PISA 2012 sample size could be a bit lower or higher than the originally assigned sample size, but this is deemed acceptable.
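A minimal sketch of selection on cumulated CMOS values follows. The exact line-number rule of B.6.3 is not reproduced in this section, so the rule used here is an assumption: line numbers are taken as (RN + k) × sampling interval for k = 0, 1, 2, ..., with RN a random number in [0, 1), and a school is selected when a line number falls within its span of cumulated CMOS. The function names are hypothetical.

```python
def conditional_mos(cprob, interval):
    """CMOS = CPROB x stratum sampling interval, rounded to 4 decimals."""
    return round(cprob * interval, 4)


def select_systematic(cmos_values, interval, rn):
    """Return indices of schools hit by the systematic line numbers.

    Assumed rule: line numbers are rn*interval, rn*interval + interval, ...
    and a school is selected when a line number is at least the cumulated
    CMOS before it and less than the cumulated CMOS including it.
    """
    selected = []
    cumulated = 0.0
    line_number = rn * interval
    for index, cmos in enumerate(cmos_values):
        cumulated += cmos
        while line_number < cumulated:
            selected.append(index)
            line_number += interval
    return selected


interval = 100.0
cprobs = [1.0, 0.0, 0.125, 0.25, 1.0]  # made-up conditional probabilities
cmos_values = [conditional_mos(c, interval) for c in cprobs]
sample = select_systematic(cmos_values, interval, rn=0.5)
```

In this illustration, schools with CPROB = 1 have CMOS equal to the sampling interval and are selected for any random start, schools with CPROB = 0 are never selected, and the number of schools selected depends on the random start, which is the variation in sample size noted above.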
225. Since in scenario A ALL PROBP are greater than or equal to PROBI, the conditional probabilities of selection for the schools also sampled in 2003 will be 1, so these schools will all be selected into the 2012 sample.
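As a consistency check that is not part of the original text, the overall 2012 probability of selection under scenario A can be worked out for a school on both frames. It assumes PROBI < 1, that a school sampled in 2003 has conditional probability 1, and that a school on the 2003 frame but not sampled has conditional probability (PROBP − PROBI) / (1 − PROBI).

```latex
\begin{align*}
P(\text{selected in 2012})
  &= \text{PROBI} \cdot 1
     + (1 - \text{PROBI}) \cdot \frac{\text{PROBP} - \text{PROBI}}{1 - \text{PROBI}} \\
  &= \text{PROBI} + \text{PROBP} - \text{PROBI} \\
  &= \text{PROBP}.
\end{align*}
```

Under these assumptions, forcing all 2003 sampled schools into the 2012 sample does not alter the intended 2012 probability of selection.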
226. The PISA 2012 school weights will be calculated as usual for the 2012 sample. Since we have ensured that all schools which were sampled in 2003 and were on the 2012 frame were also sampled for 2012, the 2003 school weights can be used for the link sample.
F.2.B Scenario B
227. If we have scenario B, where all PISA 2012 probabilities of selection are smaller than or equal to those for 2003, and apply the procedures in paragraphs 221-224, this will result in a sample for 2012 in which some but not all of the 2003 sampled schools are sampled for 2012. Additionally, schools on the 2003 frame which were not sampled for PISA 2003 will, under this scenario, not be sampled for 2012, because their conditional probability of selection is 0. Therefore, all the other schools sampled for 2012 will be those not on the 2003 frame. In this particular situation, the PISA 2012 sample can be thought of as a subsample of the 2003 sample plus a sample of new schools.
228. The PISA 2012 school weights will be calculated as usual for the 2012 sample. Since not all 2003 sampled schools are in the link sample, the 2003 weights cannot be used for the link sample.
229. However, the 2012 frame can be thought to consist of two parts: the 2003 frame and new 2012 schools. The 2012 sample can also be thought to consist of two parts: 1) a sample of 2003 sampled schools, and 2) a sample of “new” 2012 schools. The sum of 2012 weights for set 1) schools represents all still existing schools on the 2003 frame. The sum of 2012 weights for set 2) schools represents all the new schools on the 2012 frame which were not on the 2003 frame. Therefore, for the common schools in the link sample, these 2012 weights can also be used.
F.2.C Scenario C
230. If we have scenario C, where some PISA 2012 probabilities are larger than or equal to those for 2003 and some are smaller than they were for 2003, then the following additional procedure will apply for each school on the 2012 frame. If the 2012 probability of selection is greater than or equal to the 2003 probability of selection, then PROBP remains as PROBP. If the 2012 probability of selection is less than the 2003 probability of selection, then PROBP is assigned the 2003 probability of selection.
231. Since probabilities have been adjusted, these need to be summed within explicit strata to determine revised sample sizes. If the overall sample size, which now includes all 2003 schools in the 2012 sample, is larger than that already discussed with the NPM after the explicit strata small school analyses, then the NPM needs to make a decision.
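The scenario C adjustment and the revised sample sizes can be sketched as below. The record layout (`stratum`, `probp`, `probi`) and the function name are hypothetical; `probi=None` is assumed to mark a school that was not on the 2003 frame, for which PROBP is left unchanged.

```python
from collections import defaultdict


def revised_sample_sizes(schools):
    """Sum adjusted PROBP values within each explicit stratum.

    For schools on both frames, PROBP is replaced by PROBI whenever
    PROBP < PROBI, i.e. the adjusted value is max(PROBP, PROBI).
    The sum of selection probabilities in a stratum is its expected
    sample size.
    """
    totals = defaultdict(float)
    for school in schools:
        probi = school.get("probi")
        if probi is None:
            adjusted = school["probp"]
        else:
            adjusted = max(school["probp"], probi)
        totals[school["stratum"]] += adjusted
    return dict(totals)


# Made-up frame records for two explicit strata.
schools = [
    {"stratum": "S1", "probp": 0.50, "probi": 0.25},
    {"stratum": "S1", "probp": 0.25, "probi": 0.50},  # raised to 0.50
    {"stratum": "S1", "probp": 0.50, "probi": None},  # new school
    {"stratum": "S2", "probp": 0.75, "probi": 0.75},
]
sizes = revised_sample_sizes(schools)
```

In this made-up example the unadjusted probabilities in stratum S1 sum to 1.25, while the adjusted ones sum to 1.5, which is the kind of increase the NPM would be asked to accept or reject.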
232. If the NPM decides that the larger sample is fine, then the 2012 sample will be selected using paragraphs 221-224 with the adjusted PROBP values. Since all PROBP will then be greater than or equal to PROBI, the maximal overlap procedure will ensure that all 2003 schools will be in the 2012 sample. The 2012 sample will use weights calculated in the usual way for the 2012 sample. The link sample will use the 2003 weights since all 2003 schools are included in the link sample.
233. If the NPM decides that the larger sample is not acceptable, then we need to revert to paragraphs 221-224, using the unadjusted PROBP values and the PROBI values. This will maximize overlap but it will not ensure that all 2003 sampled schools are in the 2012 sample. Nor can it ensure that non-link 2012 sampled schools are restricted to new 2012 schools.
234. The PISA 2012 weights will be calculated as usual for the 2012 sample. For the link sample, since not all 2003 sampled schools are included, the 2003 weights cannot be used to weight the link sample. Since some 2012 sampled schools may be those not sampled in 2003, the 2012 weights cannot be used for the link sample either. Thus, a special set of weights will be needed for the link sample schools, based on the probability of each school being in both the 2003 sample and the 2012 sample. Each school in the link sample would be weighted as 1 / (PROBP * PROBI). Then school nonresponse adjustments would need to be done to account for schools that did not participate in PISA 2003, PISA 2012, or both. This extra procedure is ONLY required when the NPM cannot accept the larger sample size noted in paragraph 231.
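The choice of link sample weights across the scenarios can be summarized in a short sketch. The function names and return labels are hypothetical, and the nonresponse adjustments mentioned above are not shown.

```python
def link_weight_source(scenario, larger_sample_accepted=None):
    """Return which weights apply to the link sample.

    scenario: "A", "B" or "C".
    larger_sample_accepted: for scenario C only, whether the NPM accepted
    the larger sample size resulting from the adjusted probabilities.
    """
    if scenario == "A":
        return "2003 weights"
    if scenario == "B":
        return "2012 weights"
    if scenario == "C":
        if larger_sample_accepted is None:
            raise ValueError("scenario C needs the NPM decision")
        return "2003 weights" if larger_sample_accepted else "special weights"
    raise ValueError(f"unknown scenario: {scenario!r}")


def special_link_weight(probp, probi):
    """Base weight 1 / (PROBP * PROBI), before nonresponse adjustment."""
    return 1.0 / (probp * probi)


source = link_weight_source("C", larger_sample_accepted=False)
weight = special_link_weight(0.5, 0.25)
```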