
CHAPTER 3 Methodology and Implementation

3.3 Data Analysis Methods

3.3.1 Grounded Theory

Grounded Theory is a process in which researchers “bring up” theory that resides in (is grounded in) the data (Strauss and Corbin, 1998); it is commonly used in the qualitative paradigm. The data, in its various forms, is typically obtained from a real environment (for example, real programmers working on real software projects). It is then analyzed using a coding procedure that illuminates patterns or concepts which, in turn, generate theories (Strauss and Corbin, 1998). This process is repeated until it reveals no new pattern or concept. Charmaz (2006) regards this procedure as a systematic analysis of the data and states that “the rigor of Grounded Theory approaches offers qualitative researchers a set of clear guidelines from which to build explanatory frameworks that specify relationships among concepts”. Kelsey (2003) described the detailed process in Grounded Theory as a “data dance” and presented it in a framework reproduced in Figure 3.1.

Figure 3.1 - The Grounded Theory “data dance” (Kelsey, 2003)

According to Kelsey (2003), the Grounded Theory process is built on a cycle of induction and deduction. As illustrated in Figure 3.1, data collection and analysis alternate: each round of data collection is based (deductively) on the findings obtained (inductively) from the preceding round of data analysis. This process is repeated iteratively until the theory becomes “saturated” (Strauss and Corbin, 1998). Saturation is the means of identifying when the appropriate sample size for the study (see Section 3.5) has been reached. It maps to Strauss and Corbin’s (1998) concept of “theoretical saturation”. Theoretical saturation occurs when:

a. No new or relevant data seems to emerge regarding a category or categories.

b. The category is well developed in terms of its properties and dimensions, demonstrating variation.

c. The relationships among categories are well established and validated.

In other words, to achieve saturation, the researcher has to repeat the process of data collection and coding, and consequently expand the sample size, until no new data that enhances the theory is discovered. Power (2009) agrees, stating that theoretical saturation is about focusing on interesting findings from the data and finding more evidence until no new insights emerge. Power also explains that there is no “magic” number of samples that satisfies the saturation criteria.
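The stopping rule described above can be sketched as a simple loop. This is only an illustration of criterion (a), under the assumption that each round of data collection and coding is reduced to a set of code labels; the function name, the input format and the example codes are hypothetical, and criteria (b) and (c) still depend on the researcher's judgement rather than on a count.

```python
def collect_until_saturated(rounds):
    """Process coding rounds in order and stop at the first round
    (after the first) that contributes no new code.

    `rounds` is a list of sets, one per round of data collection and
    coding (a hypothetical input format chosen for this sketch).
    Returns (number of rounds processed, all codes seen so far).
    """
    seen = set()
    for i, codes in enumerate(rounds, start=1):
        new_codes = codes - seen      # codes not met in earlier rounds
        seen |= codes
        if i > 1 and not new_codes:   # nothing new emerged: treat as saturated
            return i, seen
    # Saturation was not reached within the rounds supplied.
    return len(rounds), seen


# Hypothetical codes from four rounds of coding.
rounds = [
    {"build error", "api usage"},
    {"api usage", "licensing"},
    {"build error", "licensing"},
    {"design rationale"},
]
n_rounds, codes = collect_until_saturated(rounds)
print(n_rounds, sorted(codes))
```

In this example the third round adds nothing new, so the loop stops there and the fourth round is never examined. This also shows why the resulting sample size cannot be fixed in advance.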

Figure 3.2 expands on the process of building Grounded Theory. As shown in the figure, the process consists of five phases: i) deciding on the research problem; ii) framing the research question; iii) collecting data and theoretical sampling; iv) coding and analysing the data; and v) developing the theory.

The “frame research question” phase differentiates Grounded Theory from the quantitative approach. Grounded Theory does not validate hypotheses created during earlier stages of the research. Instead, it starts with a very general research question. Any information found during the coding and analysis process that is relevant to this research question is then used to construct the theory, which is said to emerge from the data (Bitsch, 2005). This is how Grounded Theory determines the scope of the “phenomenon to be studied” (theoretical sampling) during the next cycle. Figure 3.2 also shows that, if saturation has not yet been achieved in a cycle, the researcher revises the coding process on the existing data or, if necessary, revises the data collection process and the results of the most recent theoretical sampling.

Figure 3.2 - Grounded Theory flowchart (Bitsch, 2005)

During the coding and analysis phase, repeated themes are identified and coded as concepts. Strauss and Corbin (1998) proposed a basic coding sequence within this phase to construct the emerged theory. According to them, the analysis should follow the basic coding sequence illustrated in Figure 3.3 (open coding → axial coding → selective coding). The coder should thus iterate between primary data and the emerging theoretical framework. Charmaz (2006) regards the initial coding phase as a process of categorizing segments of data with a short name that “simultaneously summarizes and accounts for each piece of data”. In other words, the emerged codes and categories are

Figure 3.3 - The Grounded Theory analytic process (adapted from Harwood, 2002)

i) Open Coding

Open (emergent) coding is relatively free from constraints and subject only to whatever patterns are emerging from the data (Power, 2002). In this thesis, open coding is used to identify categories of questions in OS mailing lists. This is called inductive data analysis in open coding (Hoepfl, 1997). Data is compared, and incidents that are sufficiently similar are grouped together and given the same conceptual label. The process of grouping concepts at a higher, more abstract level is called categorizing (Pandit, 1996). The goal here is to create descriptive, multi-dimensional categories that form a preliminary framework for analysis, as suggested by Hoepfl (1997). Similar data is then coded according to this preliminary framework. Figure 3.4 illustrates the open coding sequence.

[Figure labels: Open coding → Axial coding → Selective coding (direction of analytic sequence). Concepts: fracture and label the data (in-vivo coding). Categories: classify concepts (properties, dimensions). Sub-categories: drill down categories (When? Where? How? Why?). Axial coding: uncover relationships among categories (mini-frameworks, conditions and consequences, the paradigm). Selective coding: discover the ‘core’ category, thus developing a theoretical framework.]

Figure 3.4 - Open coding sequence according to Strauss and Corbin (1998)

The first step is conceptualizing, which is done by labelling the data in the dataset. A label can be any insight into the data that is possibly related to the study. The next step is categorization, in which the data is grouped according to the relationships between the concepts identified in it. For example, all questions found in the mailing list that have a similar topic could be placed in the same category. Finally, each category is analyzed to identify and contextualize the possible dimensions of its attributes.
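The conceptualizing and categorization steps can be illustrated with a minimal sketch. It assumes that the coder has already attached one concept label to each question; the example questions, the labels and the function name are hypothetical and are not taken from the dataset used in this thesis.

```python
from collections import defaultdict

# Hypothetical mailing-list questions, each paired with the concept
# label a coder assigned to it during conceptualizing.
labelled = [
    ("How do I compile the parser module?", "build"),
    ("Why was the cache written this way?", "design rationale"),
    ("The build fails on my machine, what is missing?", "build"),
]


def categorize(labelled_items):
    """Group items that share a concept label into one category."""
    categories = defaultdict(list)
    for text, label in labelled_items:
        categories[label].append(text)
    return dict(categories)


categories = categorize(labelled)
for label, questions in categories.items():
    print(label, len(questions))
```

The grouping itself is mechanical. The analytic work lies in choosing the labels and in the later analysis of each category's dimensions, neither of which this sketch attempts.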

ii) Axial Coding

The purpose of axial coding is to relate categories to sub-categories (Charmaz, 2006). Thus, axial coding allows the development of major categories, although they may be in the early stages of their development (Charmaz, 2006). It does this through specification of the properties and dimensions of the categories. Hence, Strauss (1987) views axial coding as building “a dense texture” of relationships around the “axis” of a category.

iii) Selective Coding

Selective coding, also called focused coding, is more directed, selective and conceptual than open coding and axial coding (Charmaz, 2006). It is done after the researcher establishes some strong analytic directions from previous coding to synthesize and explain larger segments of the data. “Selective” means using the most significant and/or frequent earlier codes to sift through a large amount of data. One goal of this process is to determine the adequacy of those codes.
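As a rough illustration of the “most frequent earlier codes” idea, the sketch below counts how often each code was assigned in earlier rounds and keeps the top few as candidates for focused coding. The code list, the function name and the cut-off `k` are assumptions made for this example. Frequency is also only one of the two criteria named above; significance remains a judgement made by the researcher.

```python
from collections import Counter


def most_frequent_codes(assigned_codes, k):
    """Return the k codes assigned most often in earlier coding rounds."""
    counts = Counter(assigned_codes)
    return [code for code, _ in counts.most_common(k)]


# Hypothetical codes assigned during open and axial coding.
earlier = ["build", "api usage", "build", "licensing", "api usage", "build"]
focus = most_frequent_codes(earlier, 2)
print(focus)
```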

According to Charmaz (2006), various qualitative data mechanisms, such as in-depth interviews and focus-group observation, can be used with the Grounded Theory approach. While in-depth interviews would also be useful for this study, the delocalized nature of open source programmers makes an interview approach impossible. Alternatively, email could be used as a medium to interview the OS programmers whose messages appear in the dataset (to learn more about their information requests, as found in the dataset). However, given that the conversations taken from the mailing list archives were in many cases several years old, an email interview seems unrealistic. Hence, group observation through the mailing lists alone seems more realistic in the context of this study.

In conclusion, Grounded Theory offers an inductive approach that can cast off the theoretical harness of other works in this area, yet is systematic in its application. Hence, this work largely concentrates on the open coding aspect of Grounded Theory, deriving some sub-categories through axial coding.
