3.3 Error Detection and Recovery
3.3.3 Identification and Recovery
3.3.3.3 Problem Solving and Performance
Skilled performance requires constant switching between skill-, rule- and knowledge-based activity. When activities are routine and familiar, they proceed in a largely automatic fashion, with periodic checks to ensure that intentions are being met. Within the GEMS model, if a check identifies a threat to meeting intentions, rule-based problem solving will take place. If a rule is found that matches conditions, it will be enacted and activity will return to skilled performance. If no rule-based solutions are found, the model suggests that an effortful, conscious, knowledge-based problem solving process will be undertaken.
The initiation of knowledge-based reasoning does not preclude ongoing searches for patterns drawn from the rule repertoire. Rapid switching can occur between knowledge- and rule-based activities in order to form and execute a recovery plan in local problem solving, a process of establishing local goals that can be carried out and assessed. In this case, routines or rules will be borrowed from other established activities.
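The switching described above can be sketched as a small decision procedure. This is only an illustrative reading of the prose, not a formalisation given by Reason: the names `Rule`, `find_rule` and `handle_check`, the dictionary used to represent a situation, and the `threat` key are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


# All names here are hypothetical; they illustrate the flow, not GEMS itself.
@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # do the rule's conditions fit the situation?
    action: Callable[[dict], dict]   # returns the situation after the rule is enacted


def find_rule(situation: dict, rules: List[Rule]) -> Optional[Rule]:
    """Rule-based level: return the first stored rule whose conditions match."""
    for rule in rules:
        if rule.matches(situation):
            return rule
    return None


def handle_check(
    situation: dict,
    rules: List[Rule],
    reason_from_knowledge: Callable[[dict], dict],
) -> Tuple[str, dict]:
    """Process one periodic check and report which level handled it."""
    # No threat to intentions: activity continues at the skill-based level.
    if not situation.get("threat"):
        return "skill", situation
    # A threat was identified: search the rule repertoire first.
    rule = find_rule(situation, rules)
    if rule is not None:
        return "rule", rule.action(situation)
    # No rule matched: fall back to effortful knowledge-based problem solving.
    return "knowledge", reason_from_knowledge(situation)
```

Because `handle_check` is called once per check, a caller can invoke it repeatedly on the updated situation, which loosely mirrors the rapid switching between knowledge- and rule-based activity during local problem solving.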
3.3.3.4 Awareness
The switch to knowledge-based problem solving can be influenced by feelings of uncertainty or worry (Reason, 1990), factors that are described by other researchers in terms of awareness. Allwood examined this in the context of evaluations of completed work that are undertaken based on the suspicion that something is wrong, while the Italian researchers describe it as mismatch emergence, which is coupled with the understanding that one is responsible for the erroneous action (Rizzo, Ferrante, & Bagnara, 1995).
| Frame of reference | Example |
| --- | --- |
| Stable Frames: Expectations and assumptions about intended actions are not changed during performance. The frame of reference remains the same. | I did some computations with a calculator. I manipulated the data by following a formula kept in my mind. The final result did not seem correct to me. I remade the computation two more times and both results were the same, but different from the first. These latter results sounded right to me. Actually, I did not discover what the error was but only that I had made an error. (E11) |
| Shifting Frames: Knowledge is updated after executing actions. The frame of reference changes after completing an action, and original expectations are adjusted in terms of outcomes. | I had to make many Xeroxes in the shortest time. I prepared the sequence of articles. I put the sheets over the machine and collected the copies in order to rearrange them in "papers". I was Xeroxing a long paper when I noted that I had to reorder all the copies, because I was feeding from the first page on. Then, I realised that starting from the end of the paper would have spared time and work. (E12) |
| Distant Frames: The active knowledge relates to a context distant either conceptually or in time or both from the erroneous action. | I decided to clean the luggage rack of my car. To ease the access to the hollow I removed the rear panel bus. Since in the panel there were the speakers of my stereo, I disconnected cables. The day after I turned on my stereo car: the left speakers did not work. I thought it was a fault in the system. One week later, I was in the car talking with a friend the possible causes of the left speaker’s breakdown. I recalled that some days before I had my car in a garage to fix minor faults. The guys in the station could have forgotten to re-connect some electric cables. Then, suddenly, I remembered that I had put my hand on electric cables too… (E13) |

Table 3.2 Frames of reference during action. Adapted from Rizzo et al. (1995).
The Italian researchers also describe awareness during error handling in the context of active expectations, the frame of reference held by a person performing an action. Though it shares features with Reason’s description of local problem solving or Norman’s description of the activation of schema based on the “goodness” of matches (1981), the notion of frame of reference was developed to counter the prevalent view in error detection studies that knowledge is static, or that all knowledge necessary to complete a task is “always available and ready to be used” (Rizzo et al., 1995, p. 8).
Instead, they argue that knowledge, and by extension the active frame of reference, is updated during interactions with an environment. Internally, this is done through the selection of alternative knowledge that is “more appropriate” and externally through new knowledge generated by assessments made of the changing state of the environment. Drawing on related research in psychology, the authors identified four frames of reference, for which they provided examples from their data, given in Table 3.2.
3.4 Summary
The causes of accidents are present in a system long before a catastrophe occurs or a clear sequence of events leading to the accident can be established (Reason, 1990). The notion of latent problems is familiar within software engineering, most famously through Brooks. In his description of software development projects, disastrous schedule slippage is gradual, due to “termites, not tornadoes” (Brooks, 1995, p. 154).
Table 3.2 (continued)

| Frame of reference | Example |
| --- | --- |
| Lack of meaning: There is no goal governing the ongoing activity. | I intended to pick up the keys of the car. They are usually in a box at the entrance. Instead, I entered another room and searched in a drawer where I did not find any keys (but there were the documents about which I was talking before, but I did not pay attention to them). Then I found myself wondering what I was looking for and why I was there. I had to go back to my office before to recall that I was leaving and so I needed the keys of my car. (E14) |
Some latent conditions should be able to be spotted and fixed (Reason, 1990). However, within software development projects, “day-to-day slippage is harder to recognise, harder to prevent, harder to make up” (Brooks, 1995, p. 154).
Disastrous events are likely never to occur again, so it is necessary to look beyond failed outcomes and to examine the particular details of situations (Rasmussen, 1990). Researchers need to understand how informants select and view events (Crandall et al., 2006). The key is to understand error from an informant’s perspective, to reconstruct the view they have when encountering things that go wrong by “standing” in the same situation. The emphasis placed on “local rationality” retrains analytical focus from judgement to dynamic factors that influence performance, including knowledge, mindset and goals (Woods & Cook, 1999).
Software development has been shown to include kinds of work associated both with active and latent error categories (Curtis, Krasner, & Iscoe, 1988; Pennington & Grabowski, 1990). Work is required that cuts across different kinds of tasks, and must be performed in response to higher level organisational concerns. If error is studied in the context of the space of possibilities within which developers perform and not in terms of the overt effects their actions have on software performance, the lens is shifted, from ends that might include critical failure or costly redevelopment to the means that make up everyday practice.
This view affords at once a narrower and a broader perspective. It is narrow in that focus is taken from software in operation, from the organisational or methodological environment in which it is produced, and from the artefacts of which it is comprised. Focus is given to the actions performed by individual developers to create software. It is broader in that analysis of individual actions permits a more general examination of error in the context of work at the desk, but also in other contexts that depend on different kinds of tools, and that produce different outcomes.
For as long as there has been software engineering, there has been error. It is a defining marker, transcending nations, regions and organisations. Research and methodology have been devoted for decades to eradicating, to minimising, to preventing it. Tools are built to manage error; records are kept to track it. Circles of people form in teams, in departments, in companies and governments to look for error, to talk about it, to plan for it, to fume and worry. Individual developers spend hours and hours hunting errors down and getting rid of them.
The problem of error in software development lies with the people who make it. Developers tinker, they are incompetent, un- or improperly skilled, they do not adhere to process. If only developers would build correct software, error would go away. If only they could design and build the right defences, error could be tolerated and the problem of error would go away. If only designs were better, requirements more clearly defined, if development tools were better and easier to use… If methodology and practice were more social, if only developers were better trained, they could get a handle on it, and the cost of error would go down.
This reduction of decades of software engineering research and commentary into two sensational paragraphs was written to provoke a sense of unease (Hammersley & Atkinson, 2007), of strangeness about the relationship between developers and error. Everyone “knows” that the problem of error in software is people; however, little is understood about what developers on the job make of it. An ethnographic stance has been taken to explore their perspective. The following pages of this chapter explain what this means.