
CHAPTER 6 PRELIMINARY RESULTS

6.5 SAMPLE CHARACTERISTICS (RESPONDENTS' PROFILES)

The characteristics of the respondents, by Age, Gender, Income Level, Income Source, and Educational Level, were summarised previously in Tables 6.3 and 6.4.

6.5.1 Age

Across both samples, the largest concentrations of respondents are in the 25 to 44 and 45 to 64 age groups. Collectively, these two groups represent 81 percent of respondents in the Taxpayer sample and 93 percent of respondents in the Tax Agent sample. Respondents in the under 25 years and over 65 years age brackets are underrepresented in both samples. Nonetheless, the ages of respondents cover the full range of categories established for the survey.

129

It should be noted that older members may not be tertiary educated.

130

A few blank questionnaires were returned in the pre-paid envelope, with respondents noting that they did not have the knowledge to complete the questionnaire. This suggests that some tax knowledge is required to be able to confidently complete the questionnaire.

131

Most income types listed in the questionnaire are subject to a form of withholding tax, except income earned by those who are self-employed.

6.5.2 Gender

Gender is equally represented in the Taxpayer sample, with a 50:50 split of males and females. In contrast, the Tax Agent sample has a higher percentage of males (61 percent) than females (39 percent). However, as discussed previously, this is consistent with the data obtained from NZICA and is representative of the total NZICA population.

6.5.3 Income Level

The largest group of respondents from the Taxpayer sample (37 percent) is in the highest income bracket (that is, $60,000 and over), while the mid-income brackets ($20,000 to $39,000 and $40,000 to $59,000) are also well represented. A reasonable number of respondents earn under $20,000. The results indicate that the Taxpayer sample covers all income brackets created for the survey.

Respondents from the highest income bracket ($60,000 and over) are significantly overrepresented in the Tax Agent sample, with comparatively marginal representation in the lower income brackets.132 This is representative of the level of income earned by members of this profession. Notwithstanding this situation, all categories of identified income brackets are covered by participants of the survey.

6.5.4 Income Source

A large percentage of respondents from the Taxpayer sample are salary and wage earners (46 percent), followed by those who are self-employed (29 percent). Similarly, the Tax Agent sample is dominated by salary and wage earners (75 percent), followed by the self-employed (21 percent). Full-time students are not represented in either sample, and the remaining sources are less well represented, especially in the Tax Agent sample. However, all categories except full-time students are covered by both observed samples, albeit at a minimal level in some cases.

132

The median annual personal income from all sources for people aged 15 years and over was $24,400 in 2006 (Statistics New Zealand, 2006).


6.5.5 Educational Level

Most respondents from the Taxpayer sample (at least 60 percent) have completed a university degree, or trade or vocational training. The third largest group is made up of those with Year 11 or under qualifications. The remainder have either completed Year 12 or 13 or have some other qualifications. The majority of respondents appear to have attained higher qualifications, which may suggest that they would have sufficient knowledge on tax matters to be able to complete the questionnaire. Further, the respondents covered all listed categories.

As expected, a large percentage of respondents from the Tax Agent sample have completed a university degree (86 percent), compared to 5 percent who completed trade or vocational training. Only one respondent had a Year 12 or 13 qualification, and none had qualifications at Year 11 or under. The level of tertiary educated respondents in this group suggests that most would have sufficient knowledge to understand and respond correctly to the questions on tax and tax issues. While the majority hold higher qualifications, the respondents covered all categories except for the 'Year 11 and under' category.

In summary, a number of tests were performed on the two data-sets to determine the adequacy of the data for further analysis. The results suggest that the two observed samples are adequately representative of all categories established for the survey, in respect of the selected attributes. In particular, the results suggest that neither data-set has any serious problem that may compromise the results of this study. The next section presents the results of some preliminary analysis, including descriptive statistics.

6.6 PRELIMINARY ANALYSIS

The previous chapter (Chapter 5) provided a detailed discussion of the preliminary data analysis undertaken for this study, which comprises a missing value analysis and a descriptive analysis. The missing value analysis was undertaken to check whether the data-sets meet the missing completely at random (MCAR) criterion, which would justify applying the Expectation Maximisation (EM) technique to address missing data. The descriptive analysis provides the basic statistical properties of the data used in this study. The results of these analyses are presented and discussed in the sections that follow.

6.6.1 Missing Value Analysis

The data-sets used for the missing value analysis have already been screened: cases and variables with more than 10 percent missing values for the Taxpayer sample, and more than 25 percent for the Tax Agent sample, were eliminated.133 This process reduced the Taxpayer sample from 191 to 180 cases, and the Tax Agent sample from 183 to 164 cases. The process also involved deleting cases with missing dependent variables, except in two instances where the respondents indicated that these had been intentionally left blank.
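The screening rule described above can be illustrated with a minimal sketch. It is not the procedure actually run for this study: the function names, the use of `None` to mark a missing answer, the toy data, and the choice to screen variables before cases are all assumptions made for illustration.

```python
def missing_share(values):
    """Fraction of entries that are missing (None marks a missing answer)."""
    return sum(v is None for v in values) / len(values)


def screen(rows, threshold):
    """Drop variables, then cases, whose missing share exceeds the threshold.

    Assumption: variables are screened before cases; the text does not
    state the order that was used.
    """
    n_vars = len(rows[0])
    keep_cols = [j for j in range(n_vars)
                 if missing_share([r[j] for r in rows]) <= threshold]
    reduced = [[r[j] for j in keep_cols] for r in rows]
    return [r for r in reduced if missing_share(r) <= threshold]


# Hypothetical toy data: 4 cases (rows) by 5 variables (columns).
toy = [
    [1, 2, 3, 4, 5],
    [1, None, 3, 4, 5],
    [None, None, None, 4, 5],
    [1, 2, 3, 4, None],
]

# A 25 percent threshold, as used for the Tax Agent sample.
screened = screen(toy, threshold=0.25)
print(len(screened), "cases retained")
```

In this toy example the second variable (half of its values missing) and the third case are removed, while a case with exactly 25 percent missing values is retained, because only shares above the threshold are eliminated.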

The reduced data-set for the Taxpayer research model comprises 180 cases and 110 indicators, giving a total of 19,800 data points, of which 149 are missing. The percentage of missing values is therefore extremely low, at 0.7 percent. Similarly, the reduced data-set for the Tax Agent model contains 18,040 data points (164 cases and 110 indicators), of which 177, or 0.9 percent, are missing. The percentages of missing data points for both models are negligible, and are also lower than the percentages accepted in prior studies.134
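The arithmetic behind these figures can be checked with a short sketch. The function name is hypothetical; the case, indicator and missing-value counts are those reported in the text.

```python
def missing_percentage(cases, indicators, missing):
    """Return (total data points, percentage of data points that are missing)."""
    data_points = cases * indicators
    return data_points, 100 * missing / data_points


taxpayer_points, taxpayer_pct = missing_percentage(180, 110, 149)
agent_points, agent_pct = missing_percentage(164, 110, 177)

print(taxpayer_points, round(taxpayer_pct, 2))  # 19800 0.75
print(agent_points, round(agent_pct, 2))        # 18040 0.98
```

The unrounded shares are about 0.75 and 0.98 percent, so the 0.7 and 0.9 percent quoted in the text appear to be truncated to one decimal place rather than rounded. Either way, both shares are below 1 percent.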

Although the number of missing values is low, it is equally important to ensure that the remaining missing values are distributed randomly throughout the observations and that no distinct patterns are identifiable in the data-sets. Data missing completely at random (MCAR) indicates a higher level of randomness, suggesting that "the cases with missing data are indistinguishable from cases with complete data" (Hair et al., 2006, p.57). Missing data that are not MCAR may cause problems in the generalisability of the results (Tabachnick & Fidell, 2007). Observations are considered to be MCAR if none of the variables in the data-set contains missing values that are related to the values of the variable under scrutiny (Meyers et al., 2006).

A Missing Value Analysis was undertaken in SPSS, which produced the estimation statistics for Little's MCAR test. The null hypothesis for Little's MCAR test is that the data points are missing completely at random. A non-significant result (p > 0.05) is consistent with the data being MCAR. The EM estimates table for the Taxpayer sample reports a non-significant value of 0.399, while the EM estimates table for the Tax Agent sample shows a non-significant value of 0.558, both clearly exceeding the 0.05 threshold.135 The null hypothesis therefore cannot be rejected, which indicates that the missing data points are probably missing completely at random, with no evidence of any systematic pattern of missing data. This outcome suggests that any estimation method applied should produce unbiased results (Hair et al., 2006).
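For readers unfamiliar with the test, the statistic behind Little's MCAR test is usually written in the following form. This is the general textbook form, not a formula reported in this study, and the notation is introduced here for illustration only. The cases are grouped into J distinct missing-data patterns; n_j is the number of cases in pattern j, \bar{y}_j is the mean vector of the variables observed in pattern j, \hat{\mu}_j and \hat{\Sigma}_j are the maximum likelihood (EM) estimates of the mean vector and covariance matrix restricted to those variables, p_j is the number of variables observed in pattern j, and p is the total number of variables.

```latex
d^2 = \sum_{j=1}^{J} n_j \,
      \left(\bar{y}_j - \hat{\mu}_j\right)^{\top}
      \hat{\Sigma}_j^{-1}
      \left(\bar{y}_j - \hat{\mu}_j\right),
\qquad
d^2 \sim \chi^2_{\nu} \ \text{(asymptotically, under MCAR)},
\qquad
\nu = \sum_{j=1}^{J} p_j - p
```

A large value of d^2 relative to the chi-square distribution (a small p-value) means the pattern-specific means differ from the overall estimates by more than chance would suggest, which is evidence against MCAR. The p-values of 0.399 and 0.558 reported above give no such evidence.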

133

Although the threshold for the Tax Agent sample was set at 25 percent, most missing values for cases and variables were under 10 percent.

134

Missing values as recorded in the following studies: 2.6 percent (Yue, 2004), 2 percent (Vatanasakdakul, 2007), and 7 percent (Venik, 1999).

135

Little's chi-square statistic for testing whether values are MCAR is reported as a footnote to any EM estimates table generated by SPSS.


6.6.2 Estimation Technique

The choice of missing data imputation method was considered next. Tabachnick & Fidell (2007, p. 71) noted that Expectation Maximisation (EM) methods generally offer the simplest and most reasonable approach to the imputation of missing data, as long as the preliminary analysis provides evidence that scores are missing randomly. The missing value analysis found no evidence that the data depart from MCAR. Further, a recent study by Kristensen and Eskildsen (2010) compared four different methods of handling missing values in a PLS model: EM substitution, pair-wise deletion, mean substitution and regression-based substitution. All of these approaches are available in the SPSS Missing Values module.136 The results provided evidence that the regression technique and the EM algorithm in general outperformed the other techniques examined. The study also noted that, for small fractions of missing values, the two techniques are not significantly different. However, as the fraction of missing values increases, the EM algorithm becomes superior to the regression technique.
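To make the idea of EM imputation concrete, the following is a minimal sketch for the simplest possible case: two variables assumed to be bivariate normal, with the first fully observed and the second partly missing. It is not the SPSS Missing Values routine used in this study, which handles many variables at once; the function name, the toy data and the fixed number of iterations are assumptions made for illustration.

```python
def em_impute_bivariate(xs, ys, iterations=200):
    """Fill missing ys (None) using EM estimates for a bivariate normal model.

    Assumes xs is fully observed and is not constant.
    """
    n = len(xs)
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n

    # Starting values taken from the observed ys only.
    observed = [y for y in ys if y is not None]
    mu_y = sum(observed) / len(observed)
    s_yy = sum((y - mu_y) ** 2 for y in observed) / len(observed)
    s_xy = 0.0

    for _ in range(iterations):
        beta = s_xy / s_xx
        resid_var = s_yy - beta * s_xy
        # E-step: expected y and y^2 for each case, given x and current estimates.
        exp_y, exp_y2 = [], []
        for x, y in zip(xs, ys):
            if y is None:
                m = mu_y + beta * (x - mu_x)
                exp_y.append(m)
                exp_y2.append(m * m + resid_var)
            else:
                exp_y.append(y)
                exp_y2.append(y * y)
        # M-step: update the estimates from the expected sufficient statistics.
        mu_y = sum(exp_y) / n
        s_yy = sum(exp_y2) / n - mu_y ** 2
        s_xy = sum(x * e for x, e in zip(xs, exp_y)) / n - mu_x * mu_y

    # Impute each missing y with its conditional mean under the final estimates.
    beta = s_xy / s_xx
    return [y if y is not None else mu_y + beta * (x - mu_x)
            for x, y in zip(xs, ys)]


# Hypothetical toy data in which every observed y equals 2 * x.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, None, 8, 10, None]
completed = em_impute_bivariate(xs, ys)
print([round(v, 3) for v in completed])
```

In this toy example the EM-based values follow the relationship between the two variables, whereas mean substitution would replace both missing entries with the observed mean of 6, regardless of the value of x.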

After careful consideration, the EM algorithm was selected, on the basis that it was considered superior to the other available methods. Applying it ensures that the data-sets for both samples are complete and adequate for further analysis. The next section presents the results from the descriptive analysis, in respect of selected study variables.