
5.6.1 Analysis of quantitative data

The quantitative data were analysed using IBM SPSS Statistics version 19.0. Significance was judged from the p-values against a 0.05 level: p > 0.05 was taken to mean that there was no statistically significant difference, and p < 0.05 that there was a statistically significant difference.

I started by running an independent-samples t-test in SPSS on the total pre-test scores for the three topics: food chain (FC), circulation and digestive system (C&DS) and electrical circuit (EC). Mean scores were compared between each experimental group and the control group, and between the two experimental groups, in order to check that students in all groups were equivalent at baseline.

With regard to students’ academic achievement, the independent-samples t-test was applied to the post-test scores to compare the effect of ICSS on achievement in each of the three topics separately. The topics were analysed separately because the simulation program, as the intervention (or treatment) tool, was new to both science teachers and students. Results were therefore expected to improve from the first topic to the second, and from the second to the third, as teachers and students became used to the program. The post-test analysis of students’ academic achievement was carried out as follows:

- Comparing students’ outcomes (in post-test) between students who used simulation in the computer lab (Exg1) and those who used traditional teaching (Cg)

- Comparing students’ outcomes (in post-test) between students who used simulation in the classroom (Exg2) and those who used traditional teaching (Cg)

- Comparing students’ outcomes (in post-test) between students who used simulation in the computer lab (Exg1) and those who used simulation in the classroom (Exg2)
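The three pairwise comparisons listed above can be illustrated outside SPSS with a minimal Python sketch. It assumes the pooled-variance ("equal variances assumed") form of the independent-samples t-test and obtains the two-sided p-value by numerically integrating the t density. The function names and the scores are hypothetical and serve illustration only; they are not the SPSS procedure or the data of this study.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, using Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_area)


def independent_t_test(group_a, group_b):
    """Pooled-variance independent-samples t-test; returns (t, df, p).

    Assumes each group has at least two scores and that the scores are
    not all identical (otherwise the standard error would be zero).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_err
    return t, df, two_sided_p(t, df)


# Made-up post-test scores, for illustration only (not data from this study).
scores = {
    "Exg1": [14, 16, 15, 18, 17, 19],
    "Exg2": [13, 15, 14, 16, 15, 17],
    "Cg": [11, 12, 14, 13, 12, 15],
}
for a, b in [("Exg1", "Cg"), ("Exg2", "Cg"), ("Exg1", "Exg2")]:
    t, df, p = independent_t_test(scores[a], scores[b])
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{a} vs {b}: t({df}) = {t:.2f}, p = {p:.3f} ({verdict})")
```

The same function could be applied to the pre-test totals to check baseline equivalence, in which case a non-significant result is the desired outcome.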

Effect size (ES) was used to measure the magnitude of the differences in the results of the measurement tools, before and after the intervention, between and within all groups in this study. Effect size is not an isolated test but a complement to the significance tests of the hypotheses, because a statistically significant result (p-value) is not necessarily practically significant. The p-value only addresses whether a finding could be due to chance and does not reveal the size of the effect. Effect size was therefore used to consider the importance of the findings aside from their statistical significance (see Lecroy & Krysik, 2007).


Coe (2002) defines effect size as a way of quantifying the size of the difference between two groups, experimental and control.

Akers (2001) defines effect size as:

“Convey the magnitude of difference in standard units between the mean of the experimental group and the mean of the control group. Used in conjunction with sample size, alpha level, and direction of the statistical hypothesis to select a value for power.” (p. 332).

Callahan and Reio (2006) defined effect size, in simple terms, as the extent to which the objects of study are different; it is the magnitude of the result the researcher observes in a sample.

There is near-universal agreement that reports of statistical procedures, such as null hypothesis significance tests, should be accompanied by an appropriate measure of effect size; this was confirmed by Baguley (2009). In addition, the American Educational Research Association (AERA, 2006) asks that reports of empirical research include an effect size for every essential statistical result. This is also the stance advocated by Cohen, an expert in statistical power, who argued that “the primary product [or result] of a research inquiry is one of measures of effect size, not p values” (1988, p. 12).

There are many different types of effect size (Huberty, 2002; Lecroy & Krysik, 2007). Huberty (2002) refers to three common types of effect size measures:

1. Effect size through the correlations between variables in a sample;

2. Effect size through standardised difference between the means of two groups; and

3. Effect size through the group overlap; the distributions of two compared groups overlap.

In the current study, the standardised difference between the means of two groups was used as the effect size measure.

According to Baguley (2009), the standardised mean difference includes two measures: Cohen’s d and Hedges’ g. This study used Cohen’s d, which is the difference between the mean values of two groups, M1 – M2 (e.g. experimental and control), divided by an estimate of the population standard deviation, (Sd1 + Sd2)/2. To obtain a more accurate estimate of the standard deviation, the pooled standard deviation was used rather than the standard deviation of a single group (Coe, 2004).
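As an illustration of this calculation, the following minimal Python sketch computes Cohen’s d using a pooled standard deviation. It assumes the common textbook form of the pooled estimate, in which each group’s variance is weighted by its degrees of freedom; this is an assumption made for the sketch and may differ in detail from the calculation performed in this study. The function names are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_sd(group_a, group_b):
    """Pooled standard deviation, weighting each variance by n - 1 (assumed form)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return math.sqrt(pooled_var)


def cohens_d(group_a, group_b):
    """Standardised mean difference: (M1 - M2) divided by the pooled SD."""
    return (mean(group_a) - mean(group_b)) / pooled_sd(group_a, group_b)


# Made-up scores for illustration only (not data from this study).
experimental = [2, 4, 6]
control = [1, 3, 5]
print(cohens_d(experimental, control))
```

A positive d indicates that the first group scored higher on average than the second; swapping the arguments only changes the sign.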


As the standardised mean difference indicates the magnitude of the treatment effect (complementing the statistical significance given by the p-value), Cohen (1988, p. 25) provided benchmark d values for interpreting it. These are:

- Small effect: d ≥ 0.2

- Medium effect: d ≥ 0.5

- Large effect: d ≥ 0.8
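These benchmarks can be applied mechanically, as in the short Python sketch below. The function name is hypothetical, and two choices are assumptions of the sketch rather than part of Cohen’s benchmarks: the absolute value of d is used so that the direction of the difference does not affect the label, and values under 0.2 are labelled "below small".

```python
def interpret_d(d):
    """Label an effect size using the benchmarks d >= 0.2, 0.5 and 0.8."""
    size = abs(d)  # assumption: direction of the difference is ignored
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"  # assumption: label for values under 0.2


for value in (0.1, 0.35, 0.5, -0.9):
    print(value, interpret_d(value))
```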

In the current study, the effect size for the standardised mean difference was calculated as follows:

A. Between groups, the effect size of the post-test results of students’ academic achievement:

- Between simulation in the computer lab group (Exg1) and the traditional teaching group (Cg); and

- Between simulation in the classroom (Exg2) and the traditional teaching group (Cg).

B. Within groups, the effect size of the pre-test versus post-test results of students’ academic achievement was evaluated as follows:

- Within simulation in the computer lab group (Exg1) versus the traditional teaching group (Cg); separately for each lesson

- Within simulation in the classroom (Exg2) versus the traditional teaching group (Cg).

With regard to the analysis of students’ understanding of the lesson objectives, the pre- and post-test scores were divided into four levels of understanding (low, medium, good and very good) according to the science teachers’ categorisation, i.e. the teachers’ test marking. Students’ test scores were then distributed across the four levels. Frequency and percentage tables were used to compare the distribution of students at each level of understanding, using the pre- and post-test scores of each experimental group (Exg1 and Exg2) separately and comparing them with the control group.
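A frequency and percentage table of this kind can be sketched in Python as follows. The sketch assumes that each student has already been assigned one of the four levels by the teacher; the score boundaries between levels are not reproduced here. The function name and the sample levels are hypothetical.

```python
from collections import Counter

LEVELS = ["low", "medium", "good", "very good"]


def level_table(levels):
    """Return {level: (frequency, percentage)} for a list of assigned levels."""
    counts = Counter(levels)
    total = len(levels)
    return {
        level: (counts[level], 100 * counts[level] / total if total else 0.0)
        for level in LEVELS
    }


# Made-up level assignments for illustration only (not data from this study).
post_test_levels = ["low", "good", "good", "very good"]
for level, (freq, pct) in level_table(post_test_levels).items():
    print(f"{level:<10} {freq:>3} {pct:6.1f}%")
```

Building one such table per group and per test (pre and post) gives the distributions that are compared in the analysis.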

For conceptual change, questions on certain concepts were selected to compare students’ performance in the pre-test and post-test with regard to their ability to grasp each concept. Some of these concepts were ones about which students are known to have misconceptions.

With regard to the usability questionnaire, the Likert-scale responses were scored (Strongly Agree = 5, Agree = 4, Neither Agree nor Disagree = 3, Disagree = 2, Strongly Disagree = 1) and a frequency and percentage table was then produced for each item on the questionnaire. The aim was to calculate the mean (average) of each item and compare it with the cut-off values shown in the table. This served as data verification for each item as follows: if the item mean was between 4 and 5, the students’ responses to the item were very positive; between 3 and 3.99, positive; between 2 and 2.99, negative; and between 1 and 1.99, very negative.
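The cut-off rule described above can be sketched in Python as follows. The function name and the sample responses are hypothetical. The sketch also assumes that a mean falling between two stated bands (for example 3.995) belongs to the lower band, which the stated cut-offs leave unspecified.

```python
from statistics import mean


def classify_item(responses):
    """Return (item mean, label) for one questionnaire item scored 1 to 5."""
    item_mean = mean(responses)
    if item_mean >= 4:
        label = "very positive"
    elif item_mean >= 3:
        label = "positive"
    elif item_mean >= 2:
        label = "negative"
    else:
        label = "very negative"
    return item_mean, label


# Made-up responses for illustration only (not data from this study).
print(classify_item([5, 4, 4, 5]))
print(classify_item([1, 2, 1, 2]))
```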

5.6.2 Analysis of qualitative data

The qualitative measurement tool in this study was the interview. Analysis means interpreting the information provided by the informant and relating it to the main objectives of the study. All the science teachers refused to be tape-recorded; the information was therefore gathered through hand-written notes.

The science teacher interviews were conducted in two parts. The first interview took place before the simulation intervention started; its aim was to collect personal information about the science teachers and their general attitude toward ICT in education. The second interview took place after the simulation intervention had finished; its aim was to explore teachers’ impressions and opinions of the use of the ICSS program in science teaching. Each teacher interview lasted close to, but not more than, 30 minutes.

The students’ interviews were held in a meeting room at the school, and recording was not used in order to allow students to talk freely. These interviews aimed to examine the students’ post-test scores and their questionnaire responses (i.e. their attitude toward science and the usability of the simulation program in science teaching and learning) by asking open "how" and "why" questions. Observing the students’ gestures during the interview was also very important for the analysis. School management provided parents’ phone numbers so that parents could be informed about the study, come to trust the interviewer, and allow their children to be interviewed about their opinions on the use of the simulation program. Each student interview was to last no more than 15 minutes.

To conduct the interview analysis, I started by identifying the desired topics from the interviews. For the teachers, examples were their tendencies and opinions regarding the benefits of ICT in the education process in general and in science education in particular, and students’ feelings while using ICSS according to the teachers’ observations during lessons. For the students, the interview themes were their tendencies regarding technology and their opinion of the ICSS program, both as users and as an educational tool for lesson display. After the interviews were finished, I read them several times and wrote down any impressions from the data that might be relevant to each topic or theme and might be useful later. The responses of both science teachers and students were then distributed under each topic or theme. Finally, the responses of both science teachers and students were presented using the narrative (transcript) method under each topic or theme, taking into account the need to show the participants’ responses in a coherent and sequential form to achieve the objective of the interview.