5. DATA COLLECTION AND PROCESSING

5.2 Data collection process

5.2.3 Data processing and analysis

To reduce and systemise data so that it can be displayed, it needs to be processed in a systematic way (Miles and Huberman, 1994; Kuckartz, 2007; Saldaña, 2009). Content analysis is one way to draw systematic patterns from data (Easterby-Smith, Thorpe and Jackson, 2015). Coding is the method of choice in content analysis to reduce the data to patterns of meaning, named codes (Kuckartz, 2007; Saldaña, 2016), and to chunk the written word into paragraphs of meaning (Easterby-Smith, Thorpe and Jackson, 2015). Codes are then structured into categories, and categories into themes and concepts that create meaning on an abstract level (Saldaña, 2009). Miles and Huberman (1994) describe that codes can be derived deductively, inductively, or both, depending on the research approach. A grounded theory approach (Glaser and Strauss, 1998) concentrates on non-pre-defined data collection, with codes developing from the set of data available. Deductive approaches develop a certain set of categories before the data analysis starts and examine the available data for meanings that belong to the pre-defined codes and categories (Miles and Huberman, 1994). King (2012) introduced template analysis, which suggests a template of codes and categories adjusted to the conceptual research framework that is then extended by the data that has been collected. He therefore combines deductive and inductive approaches. Additionally, Miles and Huberman (1994) suggest that inductive and deductive approaches can be combined depending on the data collected and the way the researcher wants to approach the systemising of data.

Coding is always already an analytical process, which means that data is addressed in a selective way from the outset (Saunders, Lewis and Thornhill, 2012). Because of that, it is important to develop a data analysis approach that fits the overall approach of the researcher and how they see the world (Easterby-Smith, Thorpe and Jackson, 2015). Coding itself is a long process that also needs to consider different types of codes. In order to understand multiple meanings of codes, researchers suggest two-stage coding, meaning a first-cycle coding on a descriptive basis and a second-cycle coding that is useful to understand relations between categories and even codes (Easterby-Smith, Thorpe and Jackson, 2015). Additionally, different codes can be introduced: descriptive codes, which display the main meaning of a data chunk; interpretative codes, which are already generalised to a certain meaning; pattern codes, which describe connections and relations between codes; and cross-pattern codes, which describe the same or distinct meanings between cases (Miles and Huberman, 1994).

Especially in multiple case studies it is important to look for commonalities and distinctions, in order to understand generalizable data and draw conclusions (Eisenhardt and Graebner, 2007). Figure 5.6 shows how the data systemising and reduction process can take place. How data systemising, reduction and analysing took place in this study is explained in the analysis chapter in more detail.

Here, the case study specifications are taken into consideration throughout the analysis process as well.

Figure (Extract/ Text/Chart/Diagram/image etc.) has been removed due to Copyright restrictions.

Figure 5.6: Data reduction and systemising process (Source: adapted from Saldaña, 2016)

In order to enhance quality, self-reflexivity and transparency need to be provided during the entire research process (Easterby-Smith, Thorpe and Jackson, 2015). To provide transparency, the researcher needs to have good and well-prepared data. Data that is collected and processed in a transparent way is vital for the quality of qualitative research (Mayring, 2015). To ensure good quality, every step that was taken for the data collection and the methodology chapter was documented. All steps undertaken are specified in this chapter. This was done not only for the data collection process but also for the data processing and analysis process. Self-reflexivity and transparency of the process are continued in the best possible way in the analysis chapter.

5.2.3.1 Expert Interviews data processing and analysis

Data processing and analysis of the expert interviews followed Saldaña’s data reduction and systemising process (Saldaña, 2016) shown in Figure 5.6. Altogether, ten expert interviews were coded and 170 pages containing 90,200 words were analysed. Following Miles and Huberman (1994), the codes were developed deductively and inductively during the coding process. Deductive codes were derived from the literature, considering the main themes that shaped the literature review, and from the main interview sections of the semi-structured interviews. The main interview sections were BEs, networks, interaction and interdependence in CR, roles in CR and KS in CR. After a first data screening, the subcategories were developed inductively. All categories and subcategories are displayed in Figure 5.7. When all codes had been allocated to the subcategories, they were divided into bulks of meaning. Key statements resulted from those bulks and are displayed and related to theory in the findings chapter. Human errors were mitigated through forward and backward control during the process, in which the coding categories were validated. Data saturation was reached at interview number nine, and the last interview confirmed the correctness of the saturation point (Strauss and Corbin, 1990). All key statements listed in the findings chapter were used for the next step of data access, collection and analysis. As the expert interviews were expected to give insights into the definition and demarcation of network and BE structures, as well as the role of the Keystone agent, they were considered to give a first insight that would help to better structure the data collected during the case study investigation.
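The saturation check described above can be illustrated with a minimal sketch. It assumes one possible operationalisation, namely that saturation is reached at the last interview that still contributes a code not seen before, with later interviews only confirming the existing codes. The function name and the code labels are hypothetical and are not taken from the study's data.

```python
from typing import List, Set


def saturation_point(codes_per_interview: List[Set[str]]) -> int:
    """Return the 1-based index of the last interview that added a new code.

    Assumption: interviews after this index only repeat known codes and
    therefore serve to confirm the saturation point.
    """
    seen: Set[str] = set()
    last_new = 0
    for index, codes in enumerate(codes_per_interview, start=1):
        if codes - seen:          # at least one code not seen before
            last_new = index
        seen |= codes
    return last_new


# Hypothetical code labels per interview (illustrative only)
interviews = [
    {"trust", "roles"},
    {"roles", "knowledge sharing"},
    {"interdependence"},
    {"trust", "interdependence"},   # adds nothing new
]
print(saturation_point(interviews))  # 3
```

In practice the judgement of saturation is interpretive; a count of new codes can support that judgement but does not replace it.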

Figure (Extract/ Text/Chart/Diagram/image etc.) has been removed due to Copyright restrictions.

Figure 5.7: Data reduction and systemising process of expert interviews (Source: adapted from Saldaña, 2016)

5.2.3.2 Case Study data processing and analysis

For the case study analysis, data processing and analysis again followed Saldaña’s (2016) data reduction and systemising process shown in Figure 5.6. As two case studies were conducted, the process of data systemising and coding, category deduction and induction was carried out twice. The same coding categories were used for both cases to ensure comparability during the cross-case analysis. Not all categories could be filled with the same amount of data or with the same density, as the two cases had a slightly different emphasis. Additionally, not exactly the same kind of interviewees and not the same network agents could be accessed in both cases. As shown in Figure 5.1, the number of observations, semi-structured interviews and open interviews on distinct levels differed. Nevertheless, despite the slightly different emphasis of the two cases, the data accessed overall contained a similar repetition and thickness (Saldaña, 2016; Strauss and Corbin, 1990). Figure 5.1 also shows which topics were addressed with whom. Open interviews often took place after official network meetings, which were attended through observation. In particular, the open interviews were often conducted in informal settings, such as during coffee breaks or after-work dinners.

The observations and the semi-structured interviews were conducted in more formal settings such as meeting rooms or offices. Different participants attended the meetings throughout the period of data collection between June 2016 and July 2017. Due to the length of the study and the great variety of methods chosen, nearly all network participants were met during the period of investigation.

Only the network participants that were interacting closely with the identified Keystone person, such as Niche players, network management and dominating companies (Iansiti and Levien, 2004a), were included in open interviews. By addressing network members that were near to, and in close contact with, the Keystone person, the interview data could mitigate a selection bias, as numerous and distinct informants contributed their views on the focal phenomena (Eisenhardt and Graebner, 2007). All details about the networks chosen and their participants are listed in the case study chapter.

The data processing and analysis process of the case studies is displayed in Figure 5.8. The process was the same for both cases to ensure better comparability of the data collected (Eisenhardt and Graebner, 2007). Providing a transparent data processing and analysis process is key for qualitative data sets, especially as researchers have questioned the validity and reliability of qualitative data throughout its history (Yin, 1981). In order to avoid misunderstandings and to display how theory was induced, all steps of the data processing and analysis framework need to be outlined. This also enables the systematic use of cross-case comparison techniques (Eisenhardt and Graebner, 2007).

Miles and Huberman (1994) suggest that the coding scheme needs to be structured and logical and should refer to the framework that has been developed during literature review. All steps undertaken during data processing and analysis are shown in Figure 4.4 and described below.

In order to prepare the data for coding, all data collected was audio recorded, transcribed and coded manually. All investigations undertaken are also displayed in Figure 5.1. Altogether, Case I contains 22 documents of primary data: 6 open interviews (with Keystone person, Niche player, Dominator, Keystone company employees), 5 semi-structured interviews (with Keystone person, Niche player, Dominator, network management), 10 observations (of network meetings) and one email from a Keystone about network development, which together amount to 98,000 words that were coded and analysed. The network investigated in Case I developed just before the study started; some interviews were conducted with members of the old network out of which it developed, and these are part of Case I.

Case II contains 27 documents of primary data: 14 open interviews (with Keystone person, Niche player, Dominator, Keystone company employees), 3 semi-structured interviews (with Keystone company employees) and 10 observations (of network meetings), which together amount to 51,000 words. In both cases, the saturation point of the data was reached after three quarters of the study had been conducted. Both studies were then extended for another few weeks in order to confirm the correctness of the saturation point (Strauss and Corbin, 1990).

In addition to the primary data set, secondary data was collected. As with primary data, secondary data should be collected following a certain research strategy, which ensures that the data is not outdated and comes from suitable sources (Boslaugh, 2000). Secondary data can be any kind of document that serves to give additional insight into the case and to answer the research question (Eisenhardt, 1989a). Secondary data for Case I and Case II was collected solely from network member company websites or websites dedicated to the network investigated. Additionally, documents provided by network members and journal articles contributing facts about the industries in which the networks are located were explored. Secondary data was screened and mainly used to provide a comprehensive introduction to the cases studied and the companies investigated.

Primary data processing and analysis started with an in-case analysis. Data collection, processing and analysis took place simultaneously (Miles and Huberman, 1994; Eisenhardt, 1989a). As a first step of the in-case analysis, all data was screened by sighting every document that belonged to the case in order to gain an overview of the case as a ‘stand-alone entity’ (Eisenhardt, 1989a, p.540). A first level coding, as displayed in Figure 5.8, was then conducted by coding statements from the documents created by transcribing the audio records. The statements developed in the first round of coding were allocated to the deductively derived main categories and inductively developed subcategories. “This process [of first allocation] allows the unique patterns of each case to emerge before investigators push to generalize patterns across cases” (Eisenhardt, 1989a, p.540). This overview of the single case, with the same coding categories used for both cases, was produced to enable better comparability. Inductive and deductive coding were combined: subjects of the literature review were looked for in the data set to match the existing framework (Miles and Huberman, 1994; Eisenhardt, 1989a), and inductive coding subcategories were developed to match existing data to the categories (Saldaña, 2016). The combination of deductive and inductive methods can help to link data to existing theoretical concepts; indeed, inductive and deductive methods can mirror each other (Eisenhardt and Graebner, 2007). As Figure 5.8 shows, the more the data is processed, the more generalizable it becomes. Therefore, the first level coding was essential to underline the differences between the two cases that could help to develop the cross-case analysis (Eisenhardt, 1989a). As all coding needs a certain revision (Miles and Huberman, 1994) and coding is done in repeating loops (Eisenhardt, 1989a), all categories and subcategories were scanned in a second level coding in order to delete duplicated categories and ensure that codes with multiple meanings were allocated correctly (Miles and Huberman, 1994). The second level coding enabled the researcher to simplify the first level coding, as subcategories that had been developed inductively were found to be overlapping and complex. The second level coding therefore helped to create data of greater density, as the data in the first level coding was ‘extended text’, extensive and poorly ordered (Easterby-Smith, Lyles and Tsang, 2008; Miles and Huberman, 1994), rather than a well-distributed data set. During both coding processes, data was labelled so that its distribution to categories remained comprehensible (Easterby-Smith, Lyles and Tsang, 2008). After these two steps, two important documents existed for every case: the first level coding document, which offers a complex insight into the data of the case and makes it possible to derive further comparative meaning through case comparison at a later stage, and the second level coding document, which already provides a first data set for the results chapter. The next step, the cross-case analysis, ensured another level of abstraction and is essential for the analysis of multiple cases (Eisenhardt, 1989a). This procedure is explained in more detail in the cross-case analysis section.
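The consolidation performed in the second level coding can be sketched as follows. This is an illustration only, under the assumption that overlapping subcategories are resolved through an explicit mapping chosen by the researcher. The main categories are taken from the interview sections named in this chapter, while the subcategory names, the statement labels and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (main category, subcategory)

# First level coding: statements allocated to (category, subcategory).
# Subcategory names and statement labels are hypothetical.
first_level: Dict[Key, List[str]] = {
    ("roles in CR", "keystone tasks"): ["S1", "S2"],
    ("roles in CR", "tasks of keystone"): ["S2", "S3"],   # overlaps with the above
    ("KS in CR", "sharing barriers"): ["S4"],
}

# Researcher's decision on which overlapping subcategories to merge.
merge_map: Dict[str, str] = {"tasks of keystone": "keystone tasks"}


def second_level(coding: Dict[Key, List[str]],
                 merges: Dict[str, str]) -> Dict[Key, List[str]]:
    """Merge duplicated subcategories and drop repeated statements."""
    merged: Dict[Key, List[str]] = defaultdict(list)
    for (category, subcategory), statements in coding.items():
        target = (category, merges.get(subcategory, subcategory))
        for statement in statements:
            if statement not in merged[target]:   # keep each statement once
                merged[target].append(statement)
    return dict(merged)


condensed = second_level(first_level, merge_map)
print(condensed)
```

Keeping the first level structure untouched and producing a separate condensed structure mirrors the two documents per case described above.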

Figure (Extract/ Text/Chart/Diagram/image etc.) has been removed due to Copyright restrictions.

Figure 5.8: Data reduction and systemising process of case study analysis

All coding levels displayed in Figure 5.8 serve a certain aim and are needed throughout the whole research process. They are a mixture of inductive and deductive coding categories. Central to multiple case studies is that they need to balance storytelling with generalisability (Eisenhardt, 1989a). In order to be able to compare cases, a certain generalisability is needed, but case specifics are still important.

As theory developed from cases is shaped by repetition and every case is its own analytical element, theory building needs to take place through ‘recursive cycling among the case data’ (Eisenhardt and Graebner, 2007, p.25), especially as closeness to the original data sets keeps the researcher close to reality. Because of this replication logic at different degrees of generalisation, and the need for comparability, multiple cases can develop better theory than single-case research, but data sampling and analysis are more complicated (Eisenhardt and Graebner, 2007; Yin, 1994).

Due to these specifics of multiple case studies, there is often a huge chasm between initial data and conclusion, especially given the vast amount of data collected in a case study (Eisenhardt, 1989a).

Miles and Huberman (1984, p.16) addressed that difficulty: “one cannot ordinarily follow how a researcher got from 3600 pages of field notes to the final conclusion, sprinkled with vivid quotes though they may be.”

5.2.3.3 Cross-case analysis

As already mentioned above, the last and necessary step for case comparison is the cross-case analysis. Again, a certain frame of analysis needs to be adhered to, especially as in the early days of qualitative analysis the cross-case analysis was criticised as being ‘even less well formulated than within-site analysis’ (Miles, 1979, p.599). The main problem was identified as the tension between the uniqueness of a single case and the generalisability needed for theory sampling (Miles, 1979; Yin, 1981). Indeed, multiple case studies cannot provide a long narrative of specific explanations (Yin, 1981) but need to balance data richness with generalisability (Eisenhardt and Graebner, 2007).

This critical view of multiple case studies can be met by introducing a clear conceptual research framework of what is to be studied (Yin, 1981). The conceptual research framework of this study has been discussed in chapters three and four. Multiple cases can only be compared when similar data sets are analysed; therefore the same coding categories and the same coding steps were used in both cases, adhering to the conceptual research framework of this thesis. This makes it possible to find out whether findings occur in both cases (Saunders, Lewis and Thornhill, 2012; Bryman, Stephens and Campo, 1996) by developing cross patterns. Cross patterns are generalised meta patterns (Miles and Huberman, 1994). Here, it is important to take into account that researchers are often poor at systematically processing information. Cross-case patterns are often influenced by their first impression. In order to avoid that bias, it is important to look at the data from many different angles.

One possibility is to select dimensions that are important across the cases. They could be on a meta level, being abstract and overlapping all categories (Eisenhardt, 1989a). This was done in a third level coding step, again displayed in Figure 5.8. During that coding procedure, the meta level codes of characteristics and actions of the Keystone on different levels of investigation were developed, which also helped to redefine the research questions (Eisenhardt, 1989a). This was done by looking for characteristics and actions that were displayed in both cases and in all categories of coding. These cross-case patterns were then ordered, and the main statements for every cross pattern category were allocated to be displayed in the findings chapter. The findings for every cross-case pattern category were then validated by re-checking their main propositions against the statements of the first level coding on case study level. “The process of building theory from case study research is a strikingly iterative one. While an investigator may focus on one part of the process at a time, the process itself involves constant iteration backwards and forward between steps. For example, an investigator may move from cross-case comparison, back to redefinition of the research question” (Eisenhardt, 1989a, p.546).
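The selection rule described above, that a meta level code must be displayed in both cases and in all coding categories, can be expressed as a set intersection. The following is a minimal sketch under that strict reading; the function name and the code labels are hypothetical, and in the study itself the matching of characteristics and actions was an interpretive step rather than an exact string comparison.

```python
from typing import Dict, Set

Case = Dict[str, Set[str]]  # coding category -> codes found in that category


def in_all_categories(case: Case) -> Set[str]:
    """Codes that occur in every coding category of a single case."""
    code_sets = list(case.values())
    return set.intersection(*code_sets) if code_sets else set()


def cross_case_patterns(case_a: Case, case_b: Case) -> Set[str]:
    """Codes present in all categories of both cases (strict reading)."""
    return in_all_categories(case_a) & in_all_categories(case_b)


# Hypothetical codes; category names follow the interview sections
case_i: Case = {
    "networks": {"mediating", "initiating"},
    "roles in CR": {"mediating", "funding"},
}
case_ii: Case = {
    "networks": {"mediating"},
    "roles in CR": {"mediating", "initiating"},
}
print(cross_case_patterns(case_i, case_ii))  # {'mediating'}
```

Codes that fail this test, such as a code found in only one case, are not discarded; they remain available for the closer inspection of case differences.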

A second tactic, also suggested by Eisenhardt (1989a), was used to understand the subtle differences and similarities of each case. These cannot be displayed in a meta-analysis but need a closer inspection. Therefore, the coding categories that related to the research context, which aim at explaining the environment in which the Keystone acts and develops its characteristics and actions, were analysed for similarities and differences (Eisenhardt, 1989a). This also supported the understanding of differences in Keystones’ actions and characteristics that are grounded in these contextual differences.

Consequently, cross-case analysis and case comparison took place on all levels of abstraction. Due to this technique, new categories that were not anticipated before can evolve. Another tactic suggested by Eisenhardt (1989a) is the division of data by data source. As the case studies displayed here contain a high number of different documents and sources, this tactic was not used.

Within-case analysis and cross-case analysis contribute to theory development through the patterns identified within cases and across cases. These patterns are linked to existing theory and are shaped by logical arguments (Eisenhardt and Graebner, 2007). While single case studies are shaped by great detail and storytelling, they result in more complicated theories due to the recognition of all particularities. Cross-case analysis can detect fewer phenomena but presents a higher data thickness and is more robust. Multiple case study analysis requires a balance between degree of detail and replication logic. “If the researcher relates the narrative of each case, then the theory is lost and the text balloons. So the challenge in multiple-case research is to stay within spatial constraints while also conveying both the emergent theory that is the research objective and the rich empirical evidence that supports the theory” (Eisenhardt and Graebner, 2007, p.29). Both authors suggest that the best way to address these particularities is to develop theory in sections of meaning which are shaped by the theoretical framework (Eisenhardt and Graebner, 2007; Miles and Huberman, 1994).

As this research is shaped by two cases, the balance between case richness and data richness is

In document Knowledge Sharing and Innovative Strategies in Organisational Collaborative Relationships: The Potential of Open Strategy (Page 167-179)