METHODOLOGY: GETTING THE CREATIVE AND COLLABORATIVE PROCESSES ‘ON RECORD’
3.3 Getting it Down
Conducting ethnographic research in the recording studio presented some unique social and logistical challenges that were fundamentally related to the distinctive architecture of the recording studio and the social setting of a recording session. The construction of a typical recording studio creates a division between the control room and the performance space: ‘with a glass window that isolates the sound of one world from the other’ (Williams, 2011). The first three recording sessions were conducted at Elevator Studios, in which the live room is on a different floor of the building from the control room. This presented an exceptional challenge in gaining observational access to all of the actions and interactions between the participants during the sessions, primarily because it was physically impossible to be in both rooms at once without disrupting the recording process. If observation was taking place in the control room during tracking, actions and interactions in the live room went unobserved. It was therefore important to spend a period of time in the live room in order to gain some perspective on what the performing musicians experienced, and similarly in the control room in order to observe the experiences of the engineer and the producer. Consequently, it was determined that a number of CCTV-style cameras would be used to record the actions and interactions in both the live room and the control room. However, there are a number of considerations for the use of video, and choices must be made in relation to: ‘where shots are to be taken, whether the camera should be fixed or mobile, whether a single focus is to be adopted or whether the focus should vary; and if so when and how’ (Hammersley and Atkinson, 2007: 148).
In response to these points, the positions of the cameras were determined on the basis of both optimum coverage and unobtrusiveness. At the beginning of the session, I consulted Darren, the engineer at Elevator Studios, on roughly where he would be positioning the members of the band to perform in the space during the recording, and three of the four cameras were then fixed in place, each focussing on an area of the live room. The cameras were chosen specifically for their unobtrusiveness: they are small, do not have large tripod-like stands and transmit their video signal wirelessly to a computer, where each of the perspectives is recorded simultaneously, as shown in Figure 8 below:
Fig. 9. ‘The Four Camera Perspectives inside Elevator Studios’.
There were distinct advantages to using multi-perspective video, particularly as it assisted in recording multiple modes in the field, namely movement, sound and sight. Film as a method of research: ‘makes field enquiries more accessible…we have words, plus intonations, plus pauses, plus facial expressions’ (Loizos, 1980: 60-61). The use of video recording was also fundamental in capturing some of the group processes that occurred during the studio sessions since:
When cognitive processes are distributed across groups, they become visible, and scientists can observe them by analyzing the verbal and gestural interactions among the participants (Sawyer, 2009: 81).
The video recordings provided the basis for ‘interaction analysis’ (Jordan & Henderson, 1995), a method of identifying the cognitive emergence of the contributions of each participant, in which interactions between the participants could be further scrutinised, played and replayed, and their visible contributions identified. A further advantage of multi-perspective video was that field notes could focus on key details rather than describing every interaction in full, particularly where a succession of fleeting interactions would have been difficult to identify, remember and detail immediately.
The permanence of video was also a distinct affordance (Grimshaw, 1982), as the entirety of each recording session could be played and replayed, allowing the focus of attention to be changed each time the video was viewed. This also helped to identify interactions or actions that had not been observed during previous viewings or even during the recording session itself (Erickson, 1982, 1992; Fetterman, 1998). However, the use of still images and film materials does have its limitations because they do not necessarily produce faithful and accurate images; rather: ‘these forms of representation are partial and conventional’ (Hammersley and Atkinson, 2007: 148). The unedited videotape is therefore limited in its representation of the event, and is further limited by the video camera’s capacity to record only the events that fall within the range of the camera lens. In this way, the multi-perspective cameras provided a distinct advantage over the use of a single video camera with one perspective but remained limited in their ability to capture participants’ inner thoughts and feelings during recording. These limitations were further amplified by the cameras’ technical constraints, namely a fixed focus and a reduced frame rate, which abridged some movements and gestures during the recording sessions. Because of these limitations, the video footage was not used to take the place of an additional observer, but rather was used in part to corroborate or correct observations that had been misremembered or misreported when writing down field notes.
The video footage also provided a useful stimulus when conducting the interviews, reminding participants of particular movements or gestures. The use of video during the interviews allowed participants to view (and review) themselves throughout the process, which encouraged them to reflect on their actions and interactions. Whilst playing the videos of the sessions during the interview, I also asked the participants to provide a running commentary on their actions, identify any habitual movements that would have been imperceptible through observation alone, and describe some of their inner thoughts and feelings as the session unfolded. However, the propensity to favour video recording in ethnography and neglect the important role of accompanying sound recordings can have unfavourable effects because: ‘poor-quality audio is always irritating, always detrimental to analysis and presentation, and injurious to results’ (Shrum, Duque and Ynalvez, 2007: 217).
Although video recording has been somewhat privileged in the discussion of data collection thus far, all of the video recordings were accompanied throughout the ethnographic process by sound recordings. These were captured either using a digital Dictaphone or through the on-board microphone of camera number 1. Both the video recordings and audio recordings were time-stamped and could therefore be played together from any desired point in order to revisit a conversation, a musical performance or a particular movement or gesture. Placement of the Dictaphone was dependent upon the type of session, the placement of the ethnographer and the location of camera number 1; for instance, in the early stages of the process, the Dictaphone was placed in the control room as the majority of the conversations between the participants could be captured through the studio speakers. However, in later sessions the Dictaphone was placed in the live room to capture the conversation and camera 1 recorded the sound in the control room. The portability of the Dictaphone also meant that it could conveniently be used to record brief conversations with the participants, which would otherwise have been cumbersome to write up quickly. The majority of the audio recordings were also transcribed to allow further analysis of the interactions, and the written transcriptions, the audio recordings and the videos together allowed a degree of substantiation between the different communicative forms. One could read the transcription and then watch the video and infer something from bringing together the verbal and the visual forms of communication.
The richness of the data provided a more comprehensive record of the communication between the participants. However, neither video nor sound recording was used in isolation, as field notes were used extensively during participant observation. As with the use of both video and sound recording, the annotation of field notes required some consideration, not least because of the practicalities of writing them within the field situation. Wolfinger (2002) outlined three considerations for note taking in the field:
First, a researcher will sometimes be able to take notes while in the field. Many fieldwork texts advocate this practice (Berg, 1989; Emerson et al., 1995; Goffman, 1989; Lofland and Lofland, 1984; Schatzman and Strauss, 1973). These preliminary notes generally form an outline when the researcher sits down at the end of the day to type out complete notes. Second, the focus of an ethnographic investigation typically narrows over time (Hammersley and Atkinson, 1983; Spradley, 1979), obviously influencing what a note-taker chooses to describe. Third, note-taking may be influenced by the perceived audience (Emerson et al., 1995). Within these broad constraints, however, an ethnographer will still have to decide exactly what should be annotated (Wolfinger, 2002: 87).
It was further considered that: ‘you cannot get it all. You will do well to get enough of the “right stuff” even after you decide what the right stuff is’ (Van Maanen, 1995: 97).
The right stuff in this instance centred on the interaction between the elements of the creative system (the individuals, domain and field) and the interaction between the participants inside the recording studio as they completed the tasks of performing, engineering and producing. Field notes were written whenever possible, with a chronological, time-stamped focus rather than a task-based one, so that sections could be identified chronologically and matched with the relevant sections of video or audio. One particular advantage of this method is that it encourages the ethnographer to reconstruct the events in the order they actually occurred, which can further encourage recall of other pertinent details (Wolfinger, 2002). As the fieldwork progressed, however, the field notes became more and more focused; Emerson et al. define this tacit selectivity of note taking as employing a ‘salience hierarchy’ (1995: 48). Thus, during the latter stages of the ethnographic process, the field notes paid particular attention to the creative interactions and conversations between the participants, particularly where a decision had been made that would have a noticeable effect on the final recording. Using both video and sound recording to capture some of the visual and sonic interactions helped to relieve some of the pressure to write exhaustive descriptions in situ. Once the participants had left the studio, the field notes were reviewed and additions were made, particularly on aspects that could not have been captured by video or audio recordings, such as facial expressions and small bodily gestures (e.g. head nodding). There were also occasions where notes could not be taken, for instance when I was assisting on the session, so the video and audio recordings were used to review the session once the participants had left the studio, and field notes were written whilst the memories of what happened were still fresh. Some of the limitations of these approaches, namely the inability always to interpret the thoughts or actions of the participants, also highlighted a fundamental requirement to interview the participants in order to further triangulate and corroborate their actions, interactions, intentions, thoughts and ideas. Brief, informal, conversational interviews were carried out throughout the recording process whenever the opportunity arose. These ranged from single questions to a series of open questions conducted either in situ or during a break in proceedings. These brief interactions were more common during the later stages of the project, when fewer members of the band were directly involved with the recording process and conversations away from the recording area were less detrimental to the participant-observation process.
Formal group and one-to-one semi-structured and unstructured interviews were conducted after the recording had finished (see Appendix 2 for dates and further details). For the first round of interviews, a semi-structured design was adopted in order to gain specific information on the participants’ knowledge of the domain and field of rock record production (see Appendix 1 for an overview of these questions). The questions were designed to be adaptable and responsive. Although, as Priest (1996) argues, the construction of interview questions and the way they are ordered, framed and presented can all have a bearing on the sort of information garnered, the questions nevertheless gave unprecedented access to the participants’ prior knowledge that was unobtainable in other ways (Rossman and Rallis, 2003). The benefit of conducting conversational interviews during the process was that particular aspects and incidents could be substantiated and discussed almost immediately after the event. However, the formal semi-structured interviews, which took place weeks or months after the recording process, lacked some of the immediacy of these in-situ interviews. For this reason, selected sections of both the video and audio recordings of the sessions were used to help participants recall their actions, thoughts, ideas and intentions. Inviting the participants to comment on the video footage and sound recordings helped to corroborate personal observations and allowed the analysis to become, at times, collaborative.
The second round of interviews was used to allow participants to expand on some of the points they made in the first round of interviews. Unstructured questioning was employed as a way to encourage participants to discuss their creative processes without imposing any prior ordering, grouping or classification of questions (Punch, 1998). For this reason, the second round of interviews included mostly spontaneous questions that were related to the flow of interaction as the participants watched and explained the video footage (Patton, 2002). The questions were more focused on each individual’s processes to help gain an understanding of whether or not the interaction of the creative system’s elements related differently to particular roles inside the recording studio. Some common questions were nevertheless asked of every participant during the second round of interviews, as this helped to corroborate answers about processes that involved collaboration within the entire group (e.g. ‘what is the process in the band of writing songs?’). These unstructured interviews provided the opportunity to pose more probing and focused questions, add further detail to previous answers, and further develop a rapport with the participants. All interviews were recorded using a Sony ICD-PX333 digital Dictaphone and transcribed for further analysis.