CHAPTER 3. METHODOLOGY
3.11 Initial Data Analysis
This section discusses initial data preparation, coding, and analysis. The analysis consisted of deductive and semi-inductive content analysis, descriptive statistics, and design mapping. It was performed in Excel and MAXQDAPlus, with Wordle.com used to generate word frequency visualizations.
3.11.1 Initial Data Preparation
The timeline interview questions were designed to correspond to specific sense-making analysis categories such as steps, questions, and question context (basis, helps, hurts, etc.). The interview questionnaire determined the data 'groups' that became the field headings in the data analysis spreadsheet. A response was defined as all of a participant’s words in reply to a specific interview question that related to that question. Little or no additional parsing was required. Every column in the spreadsheet, with the exception of participant ID number, corresponded directly to an interview question specifically designed to elicit that data per Dervin's conceptual framework. This is typical for Dervin's timeline interviews, and it differs from typical content analysis of an interview, in which a large transcript is analyzed and segments are parsed out for coding.
Initial data preparation consisted of setting up data columns in an Excel spreadsheet for each timeline interview question, entering initial data, and then inserting coding columns in the Excel dataset. Fields were set up for intercoder reliability coding of: Steps, Questions, Basis for Question, Helps, Hurts, Answer, Source, and Big Picture data. Because data collection interviews were conducted in parallel with data analysis, initial coding began once three interviews had been completed.
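The column layout described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet actually used in the study: the function name `build_header`, the "Participant ID" label, and the " Code" suffix for coding columns are assumptions introduced here, and the study's spreadsheet may have had a different column order or additional data columns.

```python
# Fields named in the text as having coding columns.
DATA_FIELDS = [
    "Steps", "Questions", "Basis for Question", "Helps",
    "Hurts", "Answer", "Source", "Big Picture",
]

def build_header(data_fields, id_field="Participant ID", code_suffix=" Code"):
    """Return a header row: the ID column, then each data column
    followed by its coding column (naming is hypothetical)."""
    header = [id_field]
    for field in data_fields:
        header.append(field)                 # participant's response
        header.append(field + code_suffix)   # code(s) assigned to that response
    return header

header = build_header(DATA_FIELDS)
```

Placing each coding column directly beside its data column keeps the participant's words and the assigned codes visible together, which is one plausible reading of "inserting coding columns" into the dataset.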
3.11.2 Initial Data Coding and Analysis
Data was first coded using a typical deductive content analysis approach for sense-making data: initial coding using Dervin’s 5Ws and an H approach, with categories of Who, What, When, Where, Why, and How (B. Dervin, 1983). For this study, ‘Why’ is the Basis for Question data. This framework helped maintain a broad perspective during initial coding. Dervin’s 5Ws and an H approach is typically followed by what is commonly referred to as inductive content analysis (Brendlinger & Dervin, 1999; Schamber, 2000). Technically, however, this is not truly inductive content analysis, because Dervin’s categories from the timeline interview template and the 5Ws and an H are the starting point. It is more appropriately described as a form of semi-inductive content analysis (Towne-Roese & Taylor, 2013), applied to code participants’ responses in an inductive manner within the data structure established by the timeline interview.
Semi-inductive content analysis of timeline interview data typically identifies issues such as resources, time, actions, understandings, emotions/self-image/motivation, situational descriptors, answers, problems, or concerns. Behaviors such as iteration are considered patterns that arise from the data, and are not part of initial coding.
Interestingly, coding complex design data with the 5Ws and an H was a very different and more useful experience than coding simpler, linear data, where application of the 5Ws and an H may be quite obvious. Early in initial coding it became apparent that some complex data was difficult to classify within the 5Ws and an H. As a result, semi-inductive coding was begun. As additional data was collected and analyzed, insights obtained during semi-inductive coding helped to clarify complex data, establish coding rules, and work out appropriate classifications within the 5Ws and an H. An example is the coding of ‘how much’ questions, which often look like a How but are actually a What. This distinction can be tricky when participants describe complex situations in ways that do not necessarily include the words ‘how much.’
Data analysis was expected to be an emergent process at this stage due to the large number of unknowns, the complexity of design data, and the lack of prior research in this area. Some deductive content analysis was periodically required for data not yet classified within the 5Ws and an H. Later, as intercoder reliability coding progressed, the work done to clarify the underlying structure of the 5Ws and an H was very useful as a basis for discussion when resolving coding disagreements.
Additional Sense-Making categories were added as more data was collected, including resources (relevant to bridging a gap), verbings (Sense-Making and Un-making actions while trying to bridge a gap), situational categories (stops, barriers, and constraints), attitudes and emotions (bridging gaps), and goals (B. Dervin, et al., Editors, 2006). Refer to Figure 4.
The intent during initial coding was to stay as true as possible to participants’ own words. Initial data analysis was completed prior to coding for issues based on the researcher’s own experience or the literature, or issues that require a higher level of analysis across multiple items or participants. A good example of a design issue that requires a higher level of analysis is iteration, which requires reviewing all of the data for a participant as a whole to try to find instances of repeated steps and/or questions.
Emergent topic areas included: interaction with peers, inclusion of administrative policies, learning whether a design idea will work by running a pilot class to test both instructional method and a hands-on design project, and receiving guidance from external industry employers.
By the time six interviews were complete, the data contained a broader range of subject matter and design experiences. Initial coding had been completed for the available data, and there were indications of some higher-level design issues, such as concerns about cross-disciplinary instructional design and implementation of related instruction. An initial codebook was developed. Up to three codes per response were permitted.
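The three-code limit can be expressed as a simple check when codes are recorded for a response. The sketch below is illustrative only: the codebook entries are placeholder labels borrowed from the issue types listed earlier in this section, not the study's actual codebook, and the function name `assign_codes` is hypothetical.

```python
MAX_CODES_PER_RESPONSE = 3  # limit stated in the text

# Placeholder codebook; the study's real codebook is not reproduced here.
EXAMPLE_CODEBOOK = {"resources", "time", "actions", "understandings", "emotions"}

def assign_codes(codes, codebook=EXAMPLE_CODEBOOK, limit=MAX_CODES_PER_RESPONSE):
    """Validate the codes chosen for one response and return them as a list."""
    codes = list(codes)
    if len(codes) > limit:
        raise ValueError(f"at most {limit} codes are permitted per response")
    unknown = [c for c in codes if c not in codebook]
    if unknown:
        raise ValueError(f"codes not in codebook: {unknown}")
    return codes

accepted = assign_codes(["resources", "time"])
```

A check like this would reject a fourth code or a label that has not yet been added to the codebook, which helps keep coding consistent as the codebook evolves.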
After interview six, the data was reviewed and determined to be valid. Some basic patterns and indications of reliability appeared, such as similar steps and questions across participants (examples: I talked to a peer. I looked for materials and references online. How do I know it’s working? I’m concerned because I need to know more about the topic.). This was a good indication that the interview protocol design was sound, but there was not yet enough data to begin intercoder reliability coding.
All interviews were performed in private offices or private conference rooms. The detailed interview script was followed closely, with the same script used for all interviews. This helped ensure validity.
Researcher bias can never be eliminated completely, but focusing on the participants’ wording and trying to view each item as a stand-alone item helped to maintain objectivity. Preceding the emergent coding with Dervin’s 5Ws and an H approach was useful to objectively refocus on the data and, to the extent possible, separate the data from the sometimes emotional experience that Jeffress and Porter have stated a timeline interview can be for the researcher (Jeffress, 2013; Porter, 2010).
Next, additional potential coding categories were determined by grouping and examining similar responses and applying the researcher’s expertise and the results of the literature review. This identified several areas with potential for more in-depth analysis. Planning was also underway for design mapping.
At this point, the general coding process was established and would continue in parallel with data collection and analysis. However, a major decision had to be made about how to implement intercoder reliability coding. A minimum intercoder reliability of 90% was anticipated to be satisfactory for the proposed study. The following section discusses intercoder reliability in detail.
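To make the 90% threshold concrete, the sketch below computes simple percent agreement between two coders. This is an assumption for illustration: this section does not state which reliability measure the study used, and the function name and sample codes are hypothetical. Chance-corrected measures such as Cohen's kappa would be computed differently.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items to which two coders assigned the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same number of items")
    if not coder_a:
        raise ValueError("at least one coded item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)

# Hypothetical 5Ws and an H codes for ten responses; the coders differ on item 2.
coder_a = ["What", "How", "Why", "What", "Who", "When", "Where", "What", "How", "Why"]
coder_b = ["What", "What", "Why", "What", "Who", "When", "Where", "What", "How", "Why"]

agreement = percent_agreement(coder_a, coder_b)
meets_threshold = agreement >= 90.0  # 90% minimum stated in the text
```

In this made-up example the coders agree on nine of ten items, so the agreement sits exactly at the 90% minimum.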