Preliminary Data Analysis - Phase 2 – Scale Development Study

6.3 Phase 2 – Scale Development Study

6.3.1 Preliminary Data Analysis

6.3.1.1 Data Cleaning

Respondents for the first study were recruited via the Amazon Mechanical Turk (AMT) service between 10 March 2015 and 23 March 2015. Three hundred respondents completed the questionnaire, and all of them answered every question, mainly because high-trust respondents were recruited through AMT. Consequently, there were no missing responses in the data exported from Bristol Online Survey, which was used to design the questionnaire, and no missing-value analysis was needed in this study. The following four techniques were used in the questionnaire to increase the quality of the collected data:

• Two questions about the online brand community were placed in the survey, asking respondents to enter the name of the community and its approximate number of members. The responses were compared with the actual community to make sure that the respondents had some knowledge of the community. Of the 300 respondents who completed the survey, 17 did not provide correct data for either the name of the community or the number of community members. These 17 respondents were immediately dropped from the study.

• Another statement was included to make sure that the respondents were reading the questions carefully. The statement was “Please select strongly agree for this question and then continue”, and seven response options in Likert format (Strongly agree = 1 and Strongly disagree = 7) were provided. Of the 283 remaining respondents, 11 selected other answers and were removed from the study.

• The answer patterns of the remaining respondents were carefully inspected, and twelve respondents were removed from the study in order to improve the quality of the data.

• The average time to complete the questionnaire was computed by AMT (Time = 14:08). Nine of the remaining respondents were identified as speeding through the survey questions. Although they had passed all the above checks, their responses were categorised as low quality and removed from the data set. In total, 251 responses remained for the subsequent data screening procedure.
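The attrition across the four screening steps can be checked with a short tally. The following is a minimal sketch using only the counts reported above; the step labels and the function name `screening_funnel` are illustrative and are not taken from the study.

```python
# Sequential screening steps and the number of respondents each one removed,
# using the figures reported in the text. Step labels are illustrative.
INITIAL_RESPONDENTS = 300
EXCLUSIONS = [
    ("community name/size check", 17),
    ("attention-check statement", 11),
    ("answer-pattern inspection", 12),
    ("speeding through the survey", 9),
]


def screening_funnel(initial, exclusions):
    """Return (step, removed, remaining) tuples for each screening step."""
    remaining = initial
    rows = []
    for step, removed in exclusions:
        remaining -= removed
        rows.append((step, removed, remaining))
    return rows


funnel = screening_funnel(INITIAL_RESPONDENTS, EXCLUSIONS)
for step, removed, remaining in funnel:
    print(f"{step:<30} removed {removed:>2}  remaining {remaining}")
```

The running totals reproduce the 283 respondents mentioned at the second step and the final sample of 251.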

6.3.1.2 Test of Outliers

As discussed in Section 5.5.2.5, boxplots and z-scores were used as two important methods to identify potential outliers or extreme responses. According to Hair et al. (2010), the acceptable range of z-scores for large samples is ±3.29. The results showed that the majority of the cases fell within this range. Three cases had z-scores of 3.18 for the item “It is important for me to have conversation with other members in the OBC who share the same opinion about the brand”. These three cases were examined further using boxplots to determine whether they should be treated as outliers. Table 29 shows the potential outliers detected using the boxplots.

Table 29 Assessment of Outliers Using Boxplot

Item                                                                                      Case Number
I recommend the brand to other members of the OBC                                         49, 60, 68, 71, 103, 201, 209
I am proud to recommend the brand to other members of the OBC                             32, 56, 87
This brand is my preferred one that can be obviously seen in my participation in the OBC  22, 45, 83, 90

The detected outliers related to the following three items: “I recommend the brand to other members of the OBC”, “I am proud to recommend the brand to other members of the OBC” and “This brand is my preferred one that can be obviously seen in my participation in the OBC”. As it is entirely normal to find some respondents who ‘extremely agree’ or ‘extremely disagree’, these cases were not considered unique. All respondents appeared to be representative of the population and, consistent with the recommendation by Hair et al. (2010), all observations were retained for the subsequent analysis.
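The two screening rules used in this section can be illustrated as follows. This is a minimal sketch, not the procedure run in the study: the response vector is hypothetical, the function names are illustrative, and the boxplot rule is assumed to be the conventional 1.5 × IQR fence, which the text does not state explicitly. Only the ±3.29 cut-off comes from the text.

```python
import statistics


def z_scores(values):
    """Standardise values using the sample mean and sample SD (n - 1)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def z_outliers(values, threshold=3.29):
    """1-based case numbers whose |z| exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values), start=1)
            if abs(z) > threshold]


def boxplot_outliers(values, k=1.5):
    """1-based case numbers outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values, start=1)
            if v < lower or v > upper]


# Hypothetical 7-point item (1 = strongly agree ... 7 = strongly disagree).
responses = [2] * 40 + [3] * 30 + [1] * 20 + [4] * 7 + [5] + [7] * 2

z_flagged = z_outliers(responses)
box_flagged = boxplot_outliers(responses)
print("z-score rule:", z_flagged)
print("boxplot rule:", box_flagged)
```

In this made-up example the boxplot rule flags one case that stays inside the z-score range, which is why the two methods are worth reading together.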

6.3.1.3 Test of Normality

To assess the normality assumption, this study, as previously stated, followed the recommendation by Kline (2011) to use the values of skewness and kurtosis. Skewness ranged from .018 to .933 for positively skewed items and from -.011 to -.738 for negatively skewed items. Kurtosis ranged from -.176 to -1.208. Consistent with the recommendation by Hair et al. (2010), the absolute values of skewness and kurtosis were below 2 and 7, respectively. Both sets of values were therefore acceptable and indicated that the data were approximately normally distributed.
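The skewness and kurtosis check can be sketched as below. The code assumes the bias-corrected sample estimators (often written G1 and G2) that common statistics packages report; the text does not say which estimator was used. The response vector is hypothetical, the function names are illustrative, and only the cut-offs of 2 and 7 come from the text.

```python
import math
import statistics


def sample_skewness(values):
    """Bias-corrected sample skewness (G1)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (G2); 0 for a normal curve."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    g2 = m4 / m2 ** 2 - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


def within_limits(values, max_skew=2.0, max_kurt=7.0):
    """True if |skewness| < max_skew and |kurtosis| < max_kurt."""
    return (abs(sample_skewness(values)) < max_skew
            and abs(sample_excess_kurtosis(values)) < max_kurt)


# Hypothetical symmetric 7-point item, for illustration only.
item = ([1] * 10 + [2] * 20 + [3] * 30 + [4] * 40
        + [5] * 30 + [6] * 20 + [7] * 10)

skew = sample_skewness(item)
kurt = sample_excess_kurtosis(item)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}, "
      f"acceptable = {within_limits(item)}")
```

A symmetric item gives a skewness of zero and a negative kurtosis, the same sign as the kurtosis values reported for this study.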

In addition, Kolmogorov-Smirnov (K-S) and Shapiro-Wilk (S-W) tests were performed to examine the normality assumption further. Both tests were significant for all items (Sig. = .000, i.e. p < .001), as shown in Table 30. Although the values of kurtosis and skewness for all items provided evidence of normality, the results of both the K-S and S-W tests show a deviation from normality. Several considerations need to be noted. First, among the non-graphical tests for normality, Stevens (2009) recommends a combination of skewness and kurtosis because “this allows for separation of the two types of normality violations”. Second, Nunnally (1978) states:

“test scores are seldom normally distributed, even if the number of items is large. Because of the positive correlation among items, a normal distribution would not be obtained”.

Third, there is strong evidence in the literature that the estimators used in exploratory factor analysis and confirmatory factor analysis are robust to different types of non-normality (Field, 2013). In addition, Malthouse (2001) notes that the data distribution obtained from a 7-point scale is not normal. Factor analysis nevertheless remains an effective tool when the data are not normally distributed. Finally, the existence of small deviations from normality does not influence the significance of the results for large samples (Gorsuch, 1983). There is agreement that a sample of more than 200 (N > 200) is considered large (Field, 2013; Gorsuch, 1983; Hair et al., 2010; Netemeyer et al., 2003).
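The behaviour of the two tests on discrete 7-point data can be illustrated with SciPy. This is a sketch under stated assumptions rather than a reproduction of the study's analysis: the item is simulated, the response weights are arbitrary, and `scipy.stats.kstest` with an estimated mean and standard deviation is the plain K-S test, not the Lilliefors-corrected variant that some packages report.

```python
import random
import statistics

from scipy import stats

# Simulated 7-point item with N = 251; the weights are arbitrary
# illustrative values, not estimates from the study.
rng = random.Random(0)
item = rng.choices(range(1, 8), weights=[5, 10, 15, 30, 20, 12, 8], k=251)

mean = statistics.fmean(item)
sd = statistics.stdev(item)

# Plain K-S test against a normal curve with the sample mean and SD.
ks = stats.kstest(item, "norm", args=(mean, sd))
# Shapiro-Wilk test of normality.
sw = stats.shapiro(item)

print(f"K-S: D = {ks.statistic:.3f}, p = {ks.pvalue:.4f}")
print(f"S-W: W = {sw.statistic:.3f}, p = {sw.pvalue:.4f}")
```

Because a 7-point item takes only seven distinct values, both tests tend to reject normality at this sample size even when the item is only mildly skewed, which fits the point attributed to Malthouse (2001) above.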

Table 30 Assessment of Normality Using Kolmogorov-Smirnov and Shapiro-Wilk

          Kolmogorov-Smirnov         Shapiro-Wilk
Items     Statistic  df   Sig.       Statistic  df   Sig.
Soc-1     .170       251  .000       .904       251  .000
Soc-2     .164       251  .000       .922       251  .000
Soc-3     .201       251  .000       .894       251  .000
Soc-4     .221       251  .000       .912       251  .000
Soc-5     .118       251  .000       .932       251  .000
Soc-6     .152       251  .000       .883       251  .000
Soc-7     .148       251  .000       .890       251  .000
Soc-8     .173       251  .000       .927       251  .000
Soc-9     .164       251  .000       .913       251  .000
Advo-1    .205       251  .000       .926       251  .000
Advo-2    .265       251  .000       .898       251  .000
Advo-3    .145       251  .000       .914       251  .000
Advo-4    .131       251  .000       .935       251  .000
Advo-5    .143       251  .000       .886       251  .000
Advo-6    .176       251  .000       .892       251  .000
Advo-7    .165       251  .000       .924       251  .000
Co-d1     .210       251  .000       .906       251  .000
Co-d2     .209       251  .000       .925       251  .000
Co-d3     .113       251  .000       .896       251  .000
Co-d4     .153       251  .000       .915       251  .000
Learn-1   .141       251  .000       .935       251  .000
Learn-2   .175       251  .000       .888       251  .000
Learn-3   .168       251  .000       .890       251  .000
Learn-4   .207       251  .000       .923       251  .000
Share-1   .226       251  .000       .902       251  .000
Share-2   .115       251  .000       .928       251  .000
Share-3   .153       251  .000       .899       251  .000
Share-4   .142       251  .000       .913       251  .000

In document Customer engagement : conceptualisation, measurement and validation (Page 149-153)