3.7 The research design
3.7.1 The plan for the research
The research design involved both larger-scale, more quantitative data collection and smaller-scale, more qualitative data gathering.
At the larger scale, the responses to the questionnaire items would be coded and quantified for analysis, and the questionnaire data would be compared with teacher estimates of learner autonomy. At the smaller scale I would be
autonomy in class: does it serve a useful purpose validly and reliably; does it have the potential to replace, or improve on, or add to teacher estimates?
3.7.1.1 Larger scale data collection
Larger scale data collection was for three purposes: item selection; construct validity checking; and comparison of the questionnaire data with teacher estimates.
3.7.1.1.1 Item selection
After compiling the initial Long List (I refrain from calling the Long List a questionnaire as it was not the finished questionnaire), the intention was to make it available online and to collect a few hundred responses. Originally, the intention had been to use factor analysis to perform data reduction and thereby form a shorter list of key items for the questionnaire. However, responses came in too slowly and in too small a number for this, so items were selected using other statistical means (see Section 5.1). Online data collection was supplemented by paper-based and email-based means to increase the quantity of data available. I also intended to collect feedback from respondents to identify any problem items, e.g. those which were not clearly worded.
After item selection, the now much shorter list (the "Short List") of items would contain the items which correlated most strongly into factors. This list of grouped items would be used to form a questionnaire designed to measure autonomy. It could potentially become an autonomy measuring instrument.
3.7.1.1.2 Construct validity checking
The Short List would be used to form a questionnaire and more data would be
validity to be carried out using factor analysis of the gathered data. The resulting picture of autonomy would be compared with the literature to see whether it was in accord.
3.7.1.1.3 Comparison of teacher estimates and questionnaire data
I originally intended to collect a large number of teacher estimates and compare them with the corresponding questionnaire data. The purpose was to establish the comparative validity and reliability of the questionnaire. The estimates and the questionnaire would be completed at the beginning and end of a course. It would be expected that the teacher estimates would become more accurate with longer exposure to the subject class. This was an assumption, since it would appear logical that increased familiarity with and knowledge of a class would lead to better estimates. The questionnaire's performance at the first and second administrations would not be expected to benefit from the intervening time period in the way that the teachers' performance would. The teachers' estimates would therefore be expected to move towards the level of the learners' autonomy over time, but the questionnaire results would not be expected to become more accurate. This is not to say that the questionnaire results will not vary over time: as seen in the literature review, it is widely accepted that autonomy varies (see Section 2.4.2). If the questionnaire results and the teacher estimates were to move closer between the first and second administrations, whether viewed for individual learners or as class averages, this would strongly suggest a change in the teacher's success in estimating autonomy rather than a change in the questionnaire's "ability". Convergence would therefore be a positive result for the questionnaire. Divergence would strongly suggest that the questionnaire was unreliable over time. If the results moved but remained equally separated, with no convergence, it would suggest that the questionnaire was matching the estimates but with a bias. This would also be a positive result, since the movement would indicate a change in the autonomy level of the class which was being picked up by both methods of measurement. In this situation, recalibrating the instrument would theoretically be one possible solution to the disparity.
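The three outcomes described above (convergence, divergence, and movement with a constant gap) can be illustrated with a minimal sketch. It is not the procedure used in this study. It assumes that teacher estimates and questionnaire scores are expressed on the same scale, and the function names, the tolerance value and the scores in the usage note are all hypothetical.

```python
from statistics import mean


def mean_gap(teacher, questionnaire):
    """Mean absolute difference between paired teacher estimates and questionnaire scores."""
    return mean(abs(t - q) for t, q in zip(teacher, questionnaire))


def classify_movement(teacher_1, quest_1, teacher_2, quest_2, tolerance=0.1):
    """Compare the gap at the first and second administrations.

    `tolerance` is an arbitrary illustrative threshold (an assumption),
    below which a change in the gap is treated as no change.
    """
    gap_1 = mean_gap(teacher_1, quest_1)
    gap_2 = mean_gap(teacher_2, quest_2)
    if gap_2 < gap_1 - tolerance:
        return "convergence"   # estimates have moved towards the questionnaire
    if gap_2 > gap_1 + tolerance:
        return "divergence"    # the two measures have drifted apart
    return "constant gap"      # both may have moved, but equally (possible bias)
```

With invented scores for three learners, `classify_movement([2.0, 3.0, 4.0], [3.0, 4.0, 5.0], [2.8, 3.9, 4.7], [3.0, 4.1, 5.0])` returns `"convergence"`, because the mean gap falls between the two administrations. If both sets of scores rise by one point and the gap is unchanged, the function returns `"constant gap"`, which corresponds to the bias case above.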
With larger samples it is easier to establish a significant correlation. A number of classes and teachers would therefore be necessary for a definite result to be found, either supporting or not supporting the questionnaire as an alternative to teacher estimates. If the required number of classes and teachers were not found, no result could be shown to be significant; any results could be no more than suggestive, as small samples may be idiosyncratic. In the event, problems meant that only two sets of teacher estimates could be gathered, and therefore statistically significant results could not be obtained.
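The dependence of significance on sample size can be shown with the standard t-test for a Pearson correlation, where t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The sketch below is illustrative only and is not the analysis carried out in this study. The correlation of 0.5 and the sample sizes are hypothetical, and the critical values are the two-tailed 0.05 values from standard t tables.

```python
import math

# Two-tailed critical t values at the 0.05 level, from standard t tables,
# keyed by degrees of freedom (n - 2). Only the two cases used here are listed.
CRITICAL_T_05 = {8: 2.306, 28: 2.048}


def t_statistic(r, n):
    """t statistic for a Pearson correlation r based on n paired observations."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def is_significant(r, n):
    """True if r is significant at the 0.05 level (two-tailed) for sample size n."""
    return abs(t_statistic(r, n)) > CRITICAL_T_05[n - 2]
```

For the same hypothetical correlation of 0.5, `is_significant(0.5, 10)` is `False` (t is about 1.63 against a critical value of 2.306), whereas `is_significant(0.5, 30)` is `True` (t is about 3.06 against 2.048). The same strength of relationship can only be shown to be significant with the larger sample.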