CHAPTER 3: RESEARCH DESIGN AND METHODOLOGY

3.13 Data analysis

Data analysis is the step in which the collected data is tested, evaluated and converted into a usable form so that the investigation can be completed (Miles et al., 2013). The researcher used several research methods to assess how the privatisation of British railways affected passenger satisfaction on London Overground. The research hypotheses were tested against the survey results. Survey responses collected from 200 passengers/staff of London Overground formed the primary data of the present investigation. To analyse the data gathered from London Overground passengers, the researcher used pie charts, bar charts and tables. The data represented in these charts was then analysed descriptively to obtain the required results. Alongside the primary data, the researcher used secondary information to gain detailed knowledge of the research topic. The secondary data was used to address the research objectives and was cross-compared with the survey results. This research therefore uses a descriptive analysis approach to cross-compare the primary and secondary data and so determine the relationship between British railway privatisation and passenger satisfaction on London Overground.

Three important techniques the researcher used to process the data obtained from the investigation are multinomial logistic regression, Cronbach's alpha and correlation analysis. Multinomial logistic regression is often referred to simply as multinomial regression. The technique is used to predict a nominal dependent variable from a set of one or more independent variables (Li et al., 2010). Moreover, Bertens et al. (2016) note that multinomial logistic regression is often considered an extension of binomial logistic regression. It is commonly used in situations where the dependent variable is nominal with more than two levels.
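As an illustration of how such a model turns independent variables into probabilities for each level of a nominal outcome, the following is a minimal Python sketch. The function name, the three outcome labels and every coefficient value are hypothetical and are not estimated from the survey data; the study itself ran its analysis in SPSS.

```python
import math

def multinomial_probabilities(x, coefficients, reference):
    """Return the probability of each outcome category for one observation.

    x            -- values of the independent variables
    coefficients -- maps each non-reference category to
                    (intercept, [one slope per independent variable])
    reference    -- name of the reference category, whose score is fixed at 0
    """
    scores = {reference: 0.0}
    for category, (intercept, slopes) in coefficients.items():
        scores[category] = intercept + sum(b * v for b, v in zip(slopes, x))
    # Exponentiate each score and normalise so the probabilities sum to 1.
    denominator = sum(math.exp(s) for s in scores.values())
    return {c: math.exp(s) / denominator for c, s in scores.items()}

# Hypothetical coefficients for a three-level satisfaction outcome.
coefficients = {
    "neutral": (0.2, [0.5, -0.1]),
    "satisfied": (-0.4, [1.1, 0.3]),
}
probabilities = multinomial_probabilities(
    [1.0, 2.0], coefficients, reference="dissatisfied"
)
for category, p in probabilities.items():
    print(f"{category}: {p:.3f}")
```

With two outcome levels the same calculation reduces to binomial logistic regression, which is why the multinomial model is described as its extension.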

Before multinomial logistic regression is applied to the data, it is necessary to evaluate whether the data can actually be analysed with this technique (Jostins and McVean, 2016). To assess this, six assumptions have to be checked. Although checking the six assumptions is time-consuming because it involves some extra procedures, it improves the likelihood of obtaining a valid result (Bertens et al., 2016). Moreover, when the analysis is run in SPSS Statistics, the data may not satisfy all six assumptions. The six assumptions linked with multinomial logistic regression are as follows:

1. The dependent variable is measured at the nominal level.

2. There are one or more independent variables, which may be continuous, ordinal or nominal.

3. The observations are independent, and the categories of the dependent variable are mutually exclusive and exhaustive.

4. There is no multicollinearity (two or more independent variables that are highly correlated with each other).

5. There is a linear relationship between any continuous independent variables and the logit transformation of the dependent variable.

6. There are no outliers, high leverage values or highly influential points.
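The multicollinearity assumption listed above can be screened informally by looking at the pairwise correlations between the independent variables. The sketch below is one simple way of doing so in Python, not the procedure used in the study. The variable names, the data values and the 0.8 cut-off are all assumptions chosen for illustration.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def flag_collinear_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Made-up values for three hypothetical independent variables.
predictors = {
    "journey_minutes": [10, 20, 30, 40, 50],
    "fare_pounds": [2.1, 3.9, 6.2, 8.0, 9.8],
    "age_years": [34, 51, 22, 45, 29],
}
flagged = flag_collinear_pairs(predictors)
print(flagged)
```

A flagged pair suggests that one of the two variables may need to be dropped or combined with the other before the regression is run.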

Cronbach's alpha is a measure used to assess how closely related the items in a single group are (Bonett and Wright, 2015). Further, according to Cho and Kim (2015), Cronbach's alpha is regarded as an evaluation of the reliability of the data. It is commonly used where a survey questionnaire includes multiple questions based on a Likert scale and it is necessary to determine the reliability of the scale. Cronbach's alpha was developed by Lee Cronbach to provide an objective way of measuring the internal consistency reliability of an instrument used in a research study. In addition, Bonett and Wright (2015) note that Cronbach's alpha is primarily used where the research relies on multiple-item measures of a concept. The equation for obtaining Cronbach's alpha is as follows:

$$\alpha = \frac{k\bar{r}}{1 + (k - 1)\bar{r}}$$

where $k$ is the number of items in the group and $\bar{r}$ is the mean of the inter-indicator correlations.

Further, according to Koo and Li (2016), the value of Cronbach's alpha is usually expressed as a number between 0.00 and 1.00. A value of 1.00 indicates that the data has perfect internal consistency, whereas a value of 0.00 indicates that the data has no consistency at all. Koo and Li (2016) also note that in exploratory research a value of 0.70 or above is considered acceptable.
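The formula above can be sketched in a few lines of Python. This is a minimal illustration that takes the mean of the pairwise correlations between items and applies the equation. The function names and the Likert responses are hypothetical and are not taken from the survey.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def alpha_from_mean_r(k, mean_r):
    """Apply alpha = k * r / (1 + (k - 1) * r) for k items and mean correlation r."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

def cronbach_alpha(items):
    """items: k lists, each holding one item's scores across all respondents."""
    correlations = [pearson(a, b) for a, b in combinations(items, 2)]
    mean_r = sum(correlations) / len(correlations)
    return alpha_from_mean_r(len(items), mean_r)

# Hypothetical 5-point Likert responses: three items, six respondents.
items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [3, 5, 2, 4, 2, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

For a fixed mean correlation, adding items raises alpha, so a long scale can reach 0.70 even when its items are only moderately correlated.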

According to Liu et al. (2003), correlation is defined as the degree of association or connection existing between the research variables. Correlation analysis is, therefore, the statistical evaluation used to understand how strong the relationship between the research variables is. Hence, correlation analysis is beneficial in cases where it is essential for the researcher to establish a connection between the research variables. If a correlation is found between the research variables, it indicates that when one of the variables changes, the other tends to change with it in a consistent manner.

Koo and Li (2016) elaborate further that correlation can be classified as positive or negative; as simple, partial or multiple; and as linear or non-linear (curvilinear). A negative correlation indicates that as one variable increases, the other decreases, whereas in a positive correlation both variables increase or decrease together. A simple correlation is one in which only two variables are studied. A partial correlation is the relationship between two variables while the effect of one or more other variables is held constant. A multiple correlation involves the relationship between one variable and several others taken together. Finally, a linear correlation indicates that the variables change at a constant ratio, and a non-linear correlation indicates that they do not change at a constant ratio.

Running a correlation analysis in SPSS is simple, but it requires some basic knowledge. According to Bryman and Cramer (2004), several techniques can be used for calculating a correlation coefficient. In SPSS, however, there are four primary methods for estimating the correlation coefficient. Bivariate analysis with Pearson correlation, found in the Analyze menu, is used to calculate correlation coefficients for continuous variables. Spearman rank correlation is used to calculate correlation coefficients when the data is in rank order; this option is available in the SPSS menu as Spearman correlation. Cramer's V, Phi and the contingency coefficient are the appropriate measures of association when the data is at the nominal level, and their values are obtained through cross-tabulation in SPSS. For a 2x2 table the Phi coefficient is the best option, whereas the contingency coefficient C is appropriate for tables of any size.
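To show what these coefficients compute, independently of the SPSS menus, the following Python sketch implements Pearson and Spearman correlation and derives Cramer's V and the contingency coefficient from a cross-tabulation. It is an illustrative sketch only. The function names and all data values are hypothetical, and SPSS output may differ in rounding.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient for continuous variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def chi_square(table):
    """Return (chi-square statistic, total count) for a cross-tabulation."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramer's V; for a 2x2 table this equals the absolute value of Phi."""
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / (n * (min(len(table), len(table[0])) - 1)))

def contingency_coefficient(table):
    """Contingency coefficient C, defined for tables of any size."""
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / (chi2 + n))

# Hypothetical data: a monotonic but non-linear pair, and a 2x2 cross-tabulation.
x, y = [1, 2, 3, 4, 5], [1, 4, 9, 16, 25]
table = [[30, 10], [10, 30]]
print(pearson(x, y), spearman(x, y))
print(cramers_v(table), contingency_coefficient(table))
```

In the example pair the Spearman coefficient is higher than the Pearson coefficient because the relationship is perfectly monotonic but not linear.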

Furthermore, according to Liu et al. (2003), the coefficient of determination can be derived from the calculated correlation coefficient. According to Bryman and Cramer (2004), the coefficient of determination is the proportion of variance shared by the two variables in the analysis. It is simple to obtain from the correlation coefficient, as it only requires squaring the correlation coefficient.
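As a worked illustration with a hypothetical value (not a result from the survey), a correlation coefficient of $r = 0.60$ gives:

```latex
$$r^2 = (0.60)^2 = 0.36$$
```

That is, about 36% of the variance is shared between the two variables. Because the coefficient is squared, a correlation of $r = -0.60$ yields the same coefficient of determination.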

In document The effect of privatisation of British railways on the satisfaction levels of passengers: a case study of London overground (Page 113-116)