Data Analysis Technique - Research Design

4. METHODOLOGY

4.4 Research Design

4.4.3 Data Analysis Technique

Statistical techniques are used to analyse quantitative data, and they can be broadly divided into two types based on their application (Ary et al., 2009; Leedy & Ormrod, 2009; Saunders et al., 2009; Walter, 2009; Wiersma & Jurs, 2009). Descriptive quantitative data analysis techniques (descriptive statistics), as the name implies, are used to explore, present and describe data, while inferential quantitative data analysis techniques (inferential statistics) are used to make inferences about a large population from a small sample of data (Leedy & Ormrod, 2009; Saunders et al., 2009).

In descriptive statistics, data is summarized and displayed in tabular, graphical or numerical form to make it easy to understand (Anderson et al., 2003). Tabular and graphical methods include frequency tables, cross-tabulation or contingency tables, quadrant analysis, bar charts, pie charts, histograms, dot plots, ogives, stem-and-leaf displays, Pareto diagrams, boxplots, scatter diagrams, correlation and mapping, which describe and show the relationships between different variables in the data set (Anderson et al., 2003, 2010; Ary et al., 2009; Cooper & Schindler, 2008; Saunders et al., 2009). Cross-tabulation or contingency tables, correlation and scatter diagrams are frequently used to describe the relationship between two variables, while all other methods represent data for one variable at a time (Anderson et al., 2003, 2010; Saunders et al., 2009). Numerical methods of data analysis deal with the frequency distribution, measures of central tendency, measures of variability, measures of relative position, exploratory data analysis, the weighted mean and measures of association (Anderson et al., 2003, 2010; Ary et al., 2009; Cooper & Schindler, 2008; Saunders et al., 2009).

In inferential statistics, inferences are made about large populations based on observations of a small sample representing that population (Leedy & Ormrod, 2009; Zikmund et al., 2010). According to Leedy & Ormrod (2009, p. 275), inferential statistics have two core functions:

1. To estimate a population parameter from a random sample

2. To test statistically based hypotheses

Estimation of a population parameter from sample statistics involves two types of estimate: a point estimate and an interval estimate (confidence interval) (Leedy & Ormrod, 2009; Wiersma & Jurs, 2009). A point estimate is a single value that represents a reasonable estimate of the corresponding population parameter, while in interval estimation an interval (confidence interval) is defined on the scale of measurement that contains acceptable estimates of the parameter (Leedy & Ormrod, 2009; Wiersma & Jurs, 2009). According to Cooper & Schindler (2008), Leedy & Ormrod (2009) and Wiersma & Jurs (2009), interval estimation is more accurate, more frequently used and preferred over point estimation.
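The difference between the two types of estimate can be illustrated with a minimal sketch. The sample values below are invented for illustration, and the interval relies on the normal approximation, which is an assumption of this sketch rather than a method taken from the cited sources; for a sample this small, a t critical value would give a somewhat wider interval.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (point_estimate, lower, upper) for the population mean.

    Assumption: uses the normal approximation for the critical value.
    """
    n = len(sample)
    point_estimate = statistics.mean(sample)
    # Standard error of the mean from the sample standard deviation
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return point_estimate, point_estimate - margin, point_estimate + margin


# Hypothetical yields (tonnes per hectare) from 10 sampled orchards
yields = [9.8, 11.2, 10.5, 12.1, 9.4, 10.9, 11.6, 10.2, 10.7, 11.0]
point, lower, upper = mean_confidence_interval(yields)
print(f"point estimate: {point:.2f}")
print(f"95% interval estimate: ({lower:.2f}, {upper:.2f})")
```

The point estimate is a single number, whereas the interval estimate reports a range around it whose width reflects the sampling variability.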

A hypothesis may be defined as a “formal statement of explanations stated in a testable form” (Zikmund et al., 2010, p. 509). Hypothesis testing, or significance testing, is the second major function of inferential statistics and can be categorized on the basis of the number of variables involved, i.e., univariate, bivariate and multivariate hypothesis testing (Saunders et al., 2009; Zikmund et al., 2010). Two families of statistical techniques are employed for hypothesis or significance testing: parametric statistics and non-parametric statistics (Leedy & Ormrod, 2009; Saunders et al., 2009; Zikmund et al., 2010).

Parametric statistics are used when the data are numerical, follow a known and continuous distribution (a normal, bell-shaped sampling distribution), are interval or ratio scaled and come from a large sample. In contrast, non-parametric statistics are employed when the data are not normally distributed (have no known distribution), and they may be termed ‘distribution free’ (Leedy & Ormrod, 2009; Saunders et al., 2009). Parametric tests include analysis of variance (ANOVA), analysis of covariance (ANCOVA), regression, factor analysis and structural equation modelling (SEM), while non-parametric tests include the sign test, the Mann-Whitney U test, the Kruskal-Wallis test, the Wilcoxon matched-pair signed rank test, the chi-square goodness-of-fit test, the odds ratio and Fisher’s exact test (Leedy & Ormrod, 2009).

The chi square test for significance, or chi square test for the independence of categorical variables, is the most appropriate, common and simple non-parametric test for nominal data such as counts or frequencies within categories (cross tables or contingency tables) in research (Burns & Burns, 2008; Cooper & Schindler, 2014; Saunders et al., 2012). However, two essential assumptions must be fulfilled when using the chi square test (Burns & Burns, 2008; Cooper & Schindler, 2014; Saunders et al., 2012):

1. The categories used in analysis must be mutually exclusive

2. In a 2x2 chi square analysis, the expected frequency in every cell should be equal to or greater than 5. For tables larger than 2x2, the expected cell count should be equal to or greater than 5 in at least 80% of the cells. If this condition is not fulfilled for a 2x2 contingency table, it is necessary either to group low-frequency categories or to use Fisher’s exact test. The disadvantage of grouping categories is that it reduces the available information. For tables larger than 2x2, this assumption can be fulfilled only by increasing the sample size.

However, if either of the two essential assumptions of the chi square test is not fulfilled, or the number of observations obtained for analysis is small, the test may produce misleading results. In this situation a more appropriate and robust test, Fisher’s exact test, is recommended and used for assessing the difference between two variables (Bower, 2003; Cooper & Schindler, 2014; Saunders et al., 2012).
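The expected-frequency rule and the fallback to Fisher’s exact test can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited author or software package: the function names are hypothetical, the 2x2 counts are invented, and the two-sided p-value is obtained by summing the hypergeometric probabilities of all tables that are no more likely than the observed one, which is one common definition.

```python
from math import comb


def expected_counts(table):
    """Expected cell frequencies under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_assumption_met(table):
    """All expected counts >= 5 for a 2x2 table; at least 80% of them for larger tables."""
    cells = [e for row in expected_counts(table) for e in row]
    share_ok = sum(e >= 5 for e in cells) / len(cells)
    is_2x2 = len(table) == 2 and len(table[0]) == 2
    return share_ok == 1.0 if is_2x2 else share_ok >= 0.8


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = respondent type, columns = marketing channel chosen
table = [[8, 2], [1, 5]]
if chi_square_assumption_met(table):
    print("expected counts are large enough for the chi square test")
else:
    p_value = fisher_exact_2x2(table)
    print(f"Fisher's exact test, two-sided p = {p_value:.4f}")
```

For these invented counts three of the four expected frequencies fall below 5, so the sketch falls back to Fisher’s exact test, which gives a two-sided p-value of about 0.035.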

As already explained, this study used both types of data analysis techniques, i.e., qualitative and quantitative. The qualitative data analysis technique (in-depth interviews) was used to explore the Pakistan citrus supply chain, the key players involved in it and their functions and, most importantly, how different players, particularly citrus growers and pre-harvest contractors, make decisions regarding the choice of citrus marketing channel. Both descriptive and inferential quantitative data analysis techniques (Fisher’s exact test and conjoint analysis) were used for the analysis of the main data of this study. Using descriptive data analysis techniques, results were presented in tabular and graphical forms to draw logical conclusions. Because the data were nominal (categorical), Fisher’s exact test was used, as it is the most suitable non-parametric test for this type of data.

Methods or techniques should not be preferred over qualitative research merely because they are quantitative in nature and produce numerical results (Ghauri & Grønhaug, 2005). It is the research question, the problem and its purpose that help to decide the suitable research methods (Ary et al., 2009; Ghauri & Grønhaug, 2005). Moreover, the suitability of a research method largely depends upon the credibility of the research findings (Bryman, 2008; Cooper & Schindler, 2008; Saunders et al., 2009; Walter, 2009; Wiersma & Jurs, 2009; Zikmund et al., 2010). Reliability, replication and validity are the prominent tools for evaluating the credibility of the selected research instrument (Ary et al., 2009; Bryman, 2008).

The aim of this study is to identify and analyse the factors that affect citrus growers’ and contractors’ selling decisions in the supply chain of citrus (Kinnow) fruit. It has already been identified that farmers do not consider only monetary value (profit maximization) when making selling decisions about their produce. They are also influenced by a number of transactional cost, socioeconomic, psychological and demographic factors (Carr & Tait, 1991; Fairweather & Keating, 1994; Gasson & Potter, 1988; Herrmann & Uttitz, 1990; Willock et al., 1999). To capture the effect of psychological factors in decision making, various personal variable models have been proposed, for example Fishbein’s model (1963) (Brascamp, 1996). Personal variable models take account of beliefs, attitudes and intentions as psychological factors in the process of decision making (Brascamp, 1996; Fishbein & Ajzen, 1975). Fishbein (1975) replaced the previous subjective expected utility (SEU) model of Edwards (1954) from behavioural decision theory, in which the decision maker had to choose the alternative with the highest expected utility.

The subjective expected utility of a given alternative can be expressed as:

$$SEU = \sum_{i=1}^{n} SP_i U_i$$

Where

$SEU$ is the subjective expected utility linked with a given alternative

$SP_i$ is the subjective probability that the choice of this alternative will lead to some outcome $i$

$U_i$ is the subjective value or utility of outcome $i$

$n$ is the number of relative outcomes
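As a minimal illustration of this decision rule, the sketch below computes the SEU of two alternatives and selects the one with the highest value. The alternatives, subjective probabilities and utilities are hypothetical numbers chosen only to show the arithmetic.

```python
def subjective_expected_utility(outcomes):
    """SEU of one alternative: sum of SP_i * U_i over its outcomes."""
    return sum(sp * u for sp, u in outcomes)


# Hypothetical alternatives, each with (subjective probability, utility) pairs
alternatives = {
    "pre-harvest contractor": [(0.9, 60), (0.1, 20)],
    "wholesale market": [(0.5, 100), (0.5, 10)],
}

seu = {name: subjective_expected_utility(o) for name, o in alternatives.items()}
# Under the SEU model the decision maker picks the alternative with the highest SEU
best = max(seu, key=seu.get)

for name, value in seu.items():
    print(f"{name}: SEU = {value:.1f}")
print(f"chosen alternative: {best}")
```

With these invented numbers the safer alternative scores 56 and the riskier one 55, so the model predicts the first is chosen even though the second offers the highest single payoff.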

Fishbein (1975) reinterpreted the subjective expected utility model and proposed a model based on an individual’s attitudes and beliefs that can be expressed mathematically as:

$$A_B = \sum_{i=1}^{n} b_i e_i$$

Where

$A_B$, which corresponds to $SEU$, represents the individual’s attitude toward the behaviour

$b_i$, which corresponds to $SP_i$, represents beliefs about the consequences of performing a given behaviour

$e_i$, which corresponds to $U_i$, represents the evaluations associated with the different outcomes

These models are relatively simple to apply but measure only a few variables, so they cannot be used to analyse complex decision-making processes involving many variables of different kinds (Brascamp, 1996).

Poole et al. (1998) identified important factors affecting producers’ marketing decisions in a survey of 300 orange and mandarin producers in Spain. They analysed the data using both descriptive/exploratory and explanatory statistics. The descriptive analysis was conducted in SPSS and measured the central tendency, dispersion and skewness of the data, which helped to eliminate outliers and incorrect entries from the data set. Using explanatory statistics, Poole et al. analysed producers’ marketing characteristics with the chi-square test for independence of variables. Finally, a multivariate technique (cluster analysis) was applied to the statistically significant variables to group the respondents and confirm the attributes that affect the marketing decision.

Cluster analysis is used to develop meaningful groups of individuals, and it can only identify mutually exclusive groups based on shared similarities (Hair et al., 2010).

Fert & Szabo (2002) used a multinomial logit model (a multivariate technique) to identify and explain the choice of farmers among various supply channels in the Hungarian fruit and vegetable sector. In a postal survey with a sample size of 66, they only identified the significant demographic and socioeconomic variables that could affect the choice among four marketing channels (wholesale markets, wholesalers, marketing cooperatives and producer organizations).
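The core of a multinomial logit model is the mapping from each channel’s systematic utility to a choice probability. The sketch below shows only that mapping; the utility values are hypothetical and are not estimates from Fert & Szabo (2002). In an estimated model, each utility would be a linear function of the farmer’s demographic and socioeconomic variables.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit: P(j) = exp(V_j) / sum over k of exp(V_k)."""
    # Subtracting the maximum utility keeps the exponentials numerically stable
    max_v = max(utilities.values())
    exp_v = {j: math.exp(v - max_v) for j, v in utilities.items()}
    total = sum(exp_v.values())
    return {j: e / total for j, e in exp_v.items()}


# Hypothetical systematic utilities of the four channels for one farmer
utilities = {
    "wholesale market": 0.4,
    "wholesaler": 1.1,
    "marketing cooperative": 0.0,
    "producer organization": -0.3,
}

probabilities = choice_probabilities(utilities)
for channel, p in probabilities.items():
    print(f"{channel}: {p:.3f}")
```

The probabilities always sum to one, and the channel with the highest utility receives the highest predicted probability of being chosen.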

Tano et al. (2003) used conjoint analysis (a multivariate technique) to quantify farmers’ preferences for cattle traits in the sub-humid zone of West Africa. Data were collected from a sample of 299 cattle-owning households through a survey. To present each cattle profile to survey respondents, they used cards with pictorial representations of the differences in trait levels. The respondents were asked to rate each cattle profile on a preference scale from 1 (least desirable) to 5 (most desirable). From these ratings, part-worth values for all the factors were estimated. By using conjoint analysis, both the significance and the utility (part-worth value) of the factors affecting farmers’ preferences for cattle traits were analysed successfully.
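The step from profile ratings to part-worth values can be sketched as follows. This is a simplified illustration, not the estimation procedure used by Tano et al. (2003): the traits, levels and ratings are hypothetical, and the calculation assumes a balanced, orthogonal design, in which the part-worth of a level equals the mean rating of the profiles containing that level minus the overall mean rating.

```python
from statistics import mean


def part_worths(profiles, ratings):
    """Part-worth of each level = mean rating at that level - grand mean.

    Assumption: balanced, orthogonal design (here a full factorial).
    """
    grand_mean = mean(ratings)
    result = {}
    for attribute in profiles[0]:
        levels = sorted({p[attribute] for p in profiles})
        result[attribute] = {
            level: mean(
                [r for p, r in zip(profiles, ratings) if p[attribute] == level]
            ) - grand_mean
            for level in levels
        }
    return result


# Hypothetical full-factorial design: two traits with two levels each
profiles = [
    {"disease resistance": "high", "size": "large"},
    {"disease resistance": "high", "size": "small"},
    {"disease resistance": "low", "size": "large"},
    {"disease resistance": "low", "size": "small"},
]
ratings = [5, 4, 2, 1]  # one respondent's ratings on the 1 to 5 scale

worths = part_worths(profiles, ratings)
# Relative importance of a trait: its part-worth range over the sum of all ranges
ranges = {a: max(pw.values()) - min(pw.values()) for a, pw in worths.items()}
importance = {a: r / sum(ranges.values()) for a, r in ranges.items()}

for attribute, pw in worths.items():
    print(attribute, pw, f"importance = {importance[attribute]:.2f}")
```

For this invented respondent, disease resistance has part-worths of +1.5 and -1.5 and accounts for 75% of the total range, while size has part-worths of +0.5 and -0.5 and accounts for the remaining 25%.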

McDermott, Lovatt, & Koslow (2004) used conjoint analysis to investigate the performance measures that were important to New Zealand beef producers and processors in their selling and buying decisions. To analyse the producers’ selling decisions, seven key factors were selected in two different contexts, spot market and contracted supply, namely: price level; payment security; quality assurance branding; space allocation lead time; sharing of processing company direction and market positioning; comfort with the buyer; and quality and effort reward. After levels were assigned to each factor, eight different scenarios (cards) were created using a fractional factorial design, and producers were asked to rank and rate these scenarios. Similarly, to analyse processors’ decisions to buy and sell beef cattle, seven key performance measures were identified and selected, namely: livestock price; livestock lead time; quality variability; traceability; supply relationship; grade; and meat quality.

In document Factors affecting marketing channel choice decisions in citrus supply chain : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Agribusiness at Massey University, Palmerston North, New Zealand (Page 105-110)