6 General discussion and conclusions
6.2 Data and research methods 1 Data
In the research, two data set have been used. The first data set contains unbalanced panel data from 105 firms that formerly participated in the FADN: 55 arable farms and 50 horticulture firms consisting of greenhouse growers and mushroom producers. The last year of participation in the FADN was 1996, 1997 or 1998. In 2000, these firms participated in an additional survey to provide data about strategic decision making. The combined data set has been used in chapter 2 resulting in 658 observations. In chapter 3, the 39 greenhouse firms have been selected for further analysis providing 242 observations.
The availability of firms participating in the FADN has the advantage that an abundance of data about firm structure, firm performance and personal characteristics are available. Representativeness of the sample is easily determined. Therefore, the survey in-depth focused on additional management information. Although the data set
is very rich, it had some limitations. First, the firms in the sample were rather
heterogeneous. The difference in firm structure of arable farms on the one hand and of greenhouse firms and mushroom producers complicated the analysis of the results. The data set had to be split into two samples, arable farms and horticulture protected cultivation, each consisting of about 50 firms. Second, the focus of the survey was too broad. Questions regarding innovations and strategic changes resulted in a wide variety of answers, which had to be interpreted and classified. Arbitrary choices had to be made if mentioned innovations corresponded with the employed definition of innovation. The limited number of observations reporting either innovation, or strategic changes regarding the application of new principles like diversification required aggregation of these changes into a new variable: firm renewal. The criterion used was that the entrepreneur needed additional knowledge and information to apply the renewal. Limitation of the survey to a specific topic (e.g. energy-saving
behaviour) would have simplified the identification of significant relationships. On the contrary, generalization of the results would have been complicated. Third, a disadvantage of the data set was the lack of panel data about strategic decision
making. FADN panel data were combined with cross-sectional survey data, implicitly assuming management variables like perceptions are constant over time. Analysis of the influence of the family-firm life cycle on key elements of the strategic decision making process in chapter 3 violates this assumption. The results indicate that internal firm characteristics and external developments are differently perceived in the family- firm life cycle stages and are thus not constant over time. The fourth limitation concerns causality. The measured management variables are assumed to explain the strategic changes and innovations. However, the changes took place before the management variables were measured. Ideally, measurement of management variables has to take place during the decision making process. If management variables are constant, the moment of measurement is less important.
The second data set used in this thesis contains cross-sectional data of 93 greenhouse firms still participating in the FADN and was used in chapters 4 and 5. Additional management information has been collected through an extensive oral survey. The management information is partly general, and partly focused at energy- saving behaviour. In chapter 5, because of missing data, 90 observations have been used for analysis. This data set largely overcomes the shortcomings of the first data set. First, the data set is more homogeneous, containing only greenhouse growers.
Second, the survey has been focused at energy-saving behaviour and innovations. Third, contrary to the first data set, cross-sectional FADN data are combined with management variables collected by the survey with only a limited time lag. Chapter 4 does not focus on causal relationships. Difficulties with causalities are avoided in chapter 5 because perceptions are not included in the explanation of presence of energy-saving technologies. Strategies are presumably more steady over time and are included in the analysis. Because of the measurement of management variables and future plans to save energy in one survey, no problems with causality exist.
Both surveys used in this thesis contained open ended questions and multiple choice questions. Analysis of open ended questions required interpretation and classification of the answers, requiring some arbitrary choices. Most questions with limited answer categories used likert scales to score the answers. Likert scales are ratio scales, implying that the data cannot be used directly in e.g. linear regression analysis. The answers had to be transformed to binary variables. E.g. perceptions measured on a five-point likert scale ranging from heavy threat to large opportunity had to be transformed to a binary variable with the options opportunity and threat. Because of the chosen methods, which were not able to handle likert scales, information got lost.
The limitations discussed have consequences for the realization of the general objective. The limitations of the data set used in chapter 2 and 3 (heterogeneity of the firms, lack of focus, and difficulties with causality) and the limited number of
observations and lack of panel data about management variables in both data set inhibited the application of sophisticated research methods covering a more comprehensive part of the conceptual model. Instead, this thesis analyzed less comprehensive relationships in each of the chapters. The relatively small number of observations also limited the number of explaining variables incorporated in the statistical analysis.
6.2.2 Quantitative methods
In this thesis several quantitative methods have been used to address research questions. In this section, the usefulness of the methods are discussed and
recommendations are made. Different criteria determine the choice of the research methods. The choice between a stochastic versus deterministic method is largely determined by the question whether relations have to be tested statistically. If
relationships have to be identified, the nature of the relationship (causality or not) plays a role. Finally, limitations like the number of observations, the availability of cross-sectional rather than panel data, and the measurement of variables play a role.
Limitations on data also put limitations on the research methods. In chapter 2, Probit analysis has been used to determine the impact of variables on firm
developments. The small fraction of observations reporting innovations,
diversification and firm growth caused imbalance between zero and one observations of the explained variables. The large share of observations that don’t report renewal or firm growth cause that the estimated probit models tend to predict only negative occurrences of the phenomena under observation, thus demonstrating the limited predictive power of the model. Furthermore, the limited number of firms and large share of observations reporting no change inhibited the application of both
multivariate probit models and probit models with random effects. Therefore, heterogeneity among the firms in the sample, interrelationships between different types of firm developments like firm growth and innovation could not be accounted for in the analysis. In chapter 5, probit analysis has been used to determine the impact of variables on existing energy-saving technologies and future plans to save energy. Multivariate probit models could not be used because of the limited number of
observations. Therefore, the less advanced technique of univariate probit analysis was employed. Probit models are suitable for identifying univariate models which can be explored in future research when more sophisticated techniques are employed.
In chapter 3, DEA (Data Envelopment Analysis) is preferred over SFA (Stochastic Frontier Analysis). The advantage of DEA, a deterministic technique, is that it is a flexible tool that does not require a functional specification. Therefore, inaccurate assumptions about the functional specification are avoided. A disadvantage is that DEA is more vulnerable for measurement errors and outliers. However, the procedure about gathering and handling information in the FADN reduce the risk for measurement errors. Besides, the super efficiency approach did not detect any outliers. DEA is limited to technical measures of performance like technical
efficiency, scale efficiency and productivity growth. The overall performance of firms is determined by both technical performance and market performance. Market
performance represents the proved ability of an entrepreneur to realize a high turnover for his products, given the quality and size of the production. DEA would gain
In chapter 3, both Tobit (and OLS) are used for analysis of the second stage. Compared to OLS models, an additional feature is that TOBIT is able to handle censored data.
In chapter 4, cluster analysis is used to distinguish family-firm life cycle stages within the sample. The application of cluster analysis contains some subjective elements. Based on the theoretical model, the researcher motivates the variables that constitute the criteria to distinguish groups. In the second stage, differences between the groups are tested. The (subjective) choices in the first stage have an impact on the results in the second stage. Changes in the selected variables in the first stage would constitute differences in the composition of the groups and thus influence the
differences between groups regarding the variables investigated in the second stage. However, in the case of an inferior theoretical model, cluster analysis would not have led to identifiable life cycle stages.