Part III: Applying grounded theory

3.23 Codes and coding

3.23.1 What are codes and what is coding?

In qualitative data analysis, a code is a researcher-generated construct that symbolises or ‘translates’ data (Vogt et al. 2014: 13), and which, according to Saldaña (2016: 4), “most often takes the form of a word or short phrase that symbolically assigns a summative, salient, essence-capturing, and/or evocative attribute for a portion of language-based or visual data.” The importance of accurate coding was highlighted by Strauss (1987: 27), who stated that “any researcher who wishes to become proficient in qualitative analysis must learn to code well and easily. The excellence of the research rests in large part on the excellence of the coding.”

The portion of data represented by an individual code is determined by the researcher and reflects his or her subjective interpretation of what that code signifies. Charmaz (2001) describes coding as the ‘critical link’ between data collection and their explanation of meaning, where the data may take the form of interview transcripts, participant observation, field notes, journals, documents, or other audio-visual sources. Codes therefore attribute interpreted meaning to each individual datum for the later purposes of pattern detection, categorisation, theory building and other aspects of analysis (Saldaña 2016: 4).

The choice of coding method for a study depends on its central and related research questions (Saldaña 2016: 70). Accordingly, the answers that a researcher seeks will influence the specific coding choices made during the research planning and design stage.

Research questions should also harmonise with the philosophical assumptions underlying the methodology, in that the epistemological and ontological positions taken should be compatible.

As noted by Saldaña (2016: 25), Friese (2014) contends that qualitative research projects should never venture into the thousands for a final number of codes, recommending between 50 and 300. Whether these numbers refer to unique codes or include multiple occurrences of the same code is unclear, but the numbers of codes generated in the three distinct grounded phases of this study were 513, 939 and 755 respectively (see Appendices S1, S2 and S3). Each occurrence of a repeated code is counted, and so contributes towards a stronger expression of its driving properties.

Codes occurring just once are labelled as ‘self-referents’ on account of their solo presence, but may still contribute to a strongly elicited concept when considered across the whole study, and so must never be excluded. Self-referents can be found in the lower rows of the conceptual indicator tables in the ‘RQx’ Appendices.

Whether viewed as one large GT study, or as three ‘staged’ phases of GT inquiry, the numbers of codes, properties and the conceptual indicators driving them were considered adequate for the task at hand. When viewed from the perspective of a single, all-encompassing GT endeavour, the total number of codes yielded somewhat exceeded Friese’s recommendations, but when considered as an inter-related, three-phased GT approach, the numbers were within an acceptable range. Furthermore, when the duplicates and self-referent ‘outliers’ were excluded, the total number of codes reduced significantly.
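The counting logic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the phase totals are those reported in this section, but the phase names and the code labels in `example_codes` are hypothetical (apart from ‘Adapting to change’, which is used as an example elsewhere in this chapter) and are not drawn from the study’s appendices.

```python
from collections import Counter

# Code totals for the three grounded phases, as reported in this section.
# The phase names are placeholders, not the study's own labels.
phase_totals = {"Phase 1": 513, "Phase 2": 939, "Phase 3": 755}
combined_total = sum(phase_totals.values())

# Hypothetical code labels, invented purely for illustration.
example_codes = [
    "Adapting to change",
    "Seeking advice",
    "Adapting to change",
    "Weighing risk",
    "Seeking advice",
    "Adapting to change",
    "Trusting intuition",
]

# Every occurrence is counted, so repeated codes carry more weight.
occurrences = Counter(example_codes)

# Codes that occur exactly once are the 'self-referents'.
self_referents = sorted(code for code, n in occurrences.items() if n == 1)

# Unique codes that remain once duplicates are collapsed and
# self-referents are set aside.
recurring_codes = sorted(code for code, n in occurrences.items() if n > 1)

print("Combined total across phases:", combined_total)
print("Occurrences per code:", dict(occurrences))
print("Self-referents:", self_referents)
print("Recurring codes:", recurring_codes)
```

The combined total for the three phases is 2207, which is the figure that exceeds Friese’s recommendation when the study is viewed as a single GT endeavour.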

3.23.2 Open coding

Open coding is common to the initial stages of many types of qualitative data analysis, including the GT approach deployed here. It begins with the line-by-line comparison of incidents in the data with one another. Open coding helps to develop the analyst’s theoretical sensitivity by encouraging the detection of patterns, similarities and affinities between the concepts and themes embedded within the data.

3.23.3 Substantive coding

Substantive coding is the process of distilling, via conceptualisation, the empirical essence from the data in which the theory is grounded. Incidents are excerpts of raw, empirical data containing properties that point towards one or more conceptual indicators or categories, and from which a grounded theory is eventually generated.

During early coding iterations, codes tended to be overly descriptive and repetitive; the researcher took this to indicate that more comparison was needed in order to fracture the data further and reduce any conceptually dense codes to those of a more conceptually loose nature. Overly descriptive codes take the form of many-word expressions (e.g. ‘Learning how to cope with change’), whereas the less conceptually imbued codes are limited to fewer expressive key words (e.g. ‘Adapting to change’).
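The contrast between many-word and key-word codes can be illustrated with a simple word-count check. The sketch below is a heuristic offered for illustration, not a procedure used in the study: the four-word threshold (`MAX_WORDS`) and the function name `is_overly_descriptive` are assumptions, and only the two example codes come from the text above.

```python
MAX_WORDS = 4  # assumed cut-off; the study does not specify one


def is_overly_descriptive(code: str, max_words: int = MAX_WORDS) -> bool:
    """Return True when a code label uses more words than the assumed cut-off."""
    return len(code.split()) > max_words


# The two example codes quoted in the text above.
codes = ["Learning how to cope with change", "Adapting to change"]

# Collect the codes that may warrant further comparison and condensing.
flagged = [code for code in codes if is_overly_descriptive(code)]

print(flagged)
```

A flag of this kind would only prompt further comparison; deciding how to condense a code remains an interpretive judgement for the researcher.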

Table 3.6 Types of Coding

3.23.4 Axial coding

Axial coding is so termed because coding occurs around the axis of a category, with categories linked at the level of their properties and dimensions (Strauss & Corbin 1998). Once the open codes have been labelled, itemised and grouped, axial coding is the process used to re-build data that have been fractured via open coding. The data are re-organised so as to identify causal relationships between the categories and sub-categories. The aim is to make the connections between categories and sub-categories explicit: the explanations and underlying relationships that account for the phenomenon to which the categories relate are brought together to form a ‘paradigm model’. Data coding at this level is intended to elevate the data to higher levels of abstraction, and as the coder becomes more theoretically sensitised to the data and the coding procedure, the fit between conceptual indicators and category becomes more apparent.

3.23.5 Selective coding

In selective coding, the core category is identified and systematically related to other categories. During the process, categories are further developed by refining relationships over various iterations, which in turn are progressively validated against the data. As the categories become more integrated, theory begins to appear. The process may be summarised thus:

• Explication of the storyline

• Relating subsidiary categories to the core category using the paradigm model

• Relating categories at the dimensional level, by understanding the range of values that a category may have. For example, the category ‘disposition to learning’ may have a range of values between highly disposed and motivated.

• Validating relationships against data

• Further refinement of the storyline

All categories are based around the ‘core’ category which represents the central phenomenon under investigation. Once identified, a storyline is developed that restates the research question in terms of its relation to the core category.

In document Entrepreneurial inference in the high-technology start-up: a model for optimised decision making and principled praxis (Page 166-170)