In this study, we have defined a generic structure of information (a taxonomic model) that we postulate to be a sound basis for defining similarities among incidents such as those described in ASRS-like aviation incident reports. On the basis of this structure, we have introduced the simplifying structure of the Scenario as a pragmatic guide for identifying similarities in what happened, based on the objective parameters that define the Context and the Outcome of a Scenario.
We assume that it is possible to design an automated clustering process guided by the structure of the Scenario, and that the results will be easy for human experts to understand. We have identified the “full and complete” set of parameters that define the Context of the initial safe state, and the anomalous Outcome. Our assumption is that this complete set of parameters adequately describes what happened. Automated tools will use the values of these parameters to identify the Scenario and to cluster similar scenarios from the ASRS database based on what happened. We have demonstrated the potential of this approach in the experiments described in this report.
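As a rough illustration of how coded parameter values could drive the grouping of similar scenarios, the following minimal sketch groups reports whose Context and Outcome values match exactly and also computes a simple matching similarity between two Contexts. It is not the clustering method used in this study. The parameter names (`flight_phase`, `weather`, `lighting`, `outcome`), the function names, and the three sample records are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical Context parameter names, chosen only for illustration.
CONTEXT_KEYS = ("flight_phase", "weather", "lighting")

def scenario_signature(report):
    """Return the (Context, Outcome) signature of one coded report."""
    context = tuple(report.get(key) for key in CONTEXT_KEYS)
    return context, report.get("outcome")

def context_similarity(a, b):
    """Fraction of Context parameters on which two reports agree."""
    matches = sum(a.get(key) == b.get(key) for key in CONTEXT_KEYS)
    return matches / len(CONTEXT_KEYS)

def group_by_scenario(reports):
    """Group report ids whose Context and Outcome values match exactly."""
    groups = defaultdict(list)
    for report in reports:
        groups[scenario_signature(report)].append(report["id"])
    return dict(groups)

# Invented sample records, not taken from the ASRS database.
reports = [
    {"id": 1, "flight_phase": "approach", "weather": "IMC",
     "lighting": "night", "outcome": "altitude deviation"},
    {"id": 2, "flight_phase": "approach", "weather": "IMC",
     "lighting": "night", "outcome": "altitude deviation"},
    {"id": 3, "flight_phase": "taxi", "weather": "VMC",
     "lighting": "night", "outcome": "runway incursion"},
]
groups = group_by_scenario(reports)
```

Exact matching is the strictest possible grouping rule. A practical clustering process would more likely group reports whose similarity exceeds some threshold, which is the role `context_similarity` hints at here.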
The limited experiment of the “Case Study” discussed in Section 7 showed, within the limitations of the small number of reports used, the value of the Scenario model for clustering reports based on similarities of Context plus Outcome. Moreover, the rough codification of the why for this small set of reports showed that misrepresentation was a common factor and identified some subjective parameters that can be contributing factors to Behavior. This experiment encouraged us to continue with our approach to analyzing free text for information on why an incident occurred.
Then we used our current automated capabilities to cluster the objective parameters as they are coded in the current ASRS database. We considered the dominant cluster to be representative of the Context of each Scenario, and determined that there are certain common dominant factors associated with each anomalous Outcome. We cross-tabulated the data set using ten identified Contextual Patterns as the rows and ten chosen anomalous Outcomes as the columns. We then computed, for each cell, the ratio of the observed count to the statistically expected count. We concluded that relationships that are both statistically and operationally meaningful exist between Contextual Factors/Patterns, on the one hand, and specific types of unwanted aviation safety Outcomes, on the other. We recognized that the multiplicity of contextual factors that may be present during aviation safety events creates analytical challenges (i.e., the dimensionality needs to be reduced through recurrent pattern identification).
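The observed-to-expected ratio for a cross-tabulation can be sketched as follows. This is a minimal sketch that assumes the "statistically expected" count is the one implied by independence of rows and columns (row total times column total, divided by the grand total). The text above does not state which definition was used. The function name and the 2×2 example table are hypothetical.

```python
def observed_to_expected(table):
    """Return the observed/expected ratio for each cell of a contingency table.

    `table` is a list of rows of observed counts, with Contextual Patterns
    as rows and anomalous Outcomes as columns. The expected count assumes
    that rows and columns are independent (an assumption of this sketch).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table contains no observations")

    ratios = []
    for i, row in enumerate(table):
        ratio_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            # A zero expected count leaves the ratio undefined.
            ratio_row.append(observed / expected if expected else float("nan"))
        ratios.append(ratio_row)
    return ratios

# Invented 2x2 example; the study's table was 10x10.
ratios = observed_to_expected([[30, 10], [10, 30]])
```

Ratios well above 1 mark Context and Outcome pairs that co-occur more often than independence would predict, and ratios well below 1 mark pairs that co-occur less often.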
This report has presented a first-generation process for routinely searching large databases of aviation accident or incident reports, and consistently and reliably analyzing them for objective factors (the what of an incident) as well as the causal factors of human behavior (the why of an incident). We have proposed a method for applying the paradigm of Situational Awareness (SA), with its five components of Detection, Recognition, Interpretation, Comprehension, and Prediction, to automated clustering on the objective and subjective parameters associated with erroneous human Behavior from the free-text narrative of an incident report. Noting that the discriminating components of SA have a natural sequence, or chronology, that can be linked with event chains, we have postulated the possibility of identifying effective interventions for the elements of human error identified in incident and accident data.
We have assumed a very simple model for describing the human behavior associated with the transition to an anomalous state in our concept of the Scenario. There are likely other factors besides loss of SA that could influence transitions in some scenarios. However, the research literature documents the very high frequency with which human error can be related to loss of SA. Certainly, not all of the contextual factors of the last safe state prevail unchanged throughout the transition, and those changes both influence and are influenced by the human actions on the system. Also, it is clear that human cognition is a cyclic process and not the simple sequential process of our DRICP framework. Nevertheless, we maintain that our simplified model of Scenario and Behavior is both necessary and justifiable in this first generation of automated analyses of free text. It is necessary to keep the analysis tractable within currently available capabilities, and it is justifiable because there is every reason to believe that ASRS reports are usually delivered as sequential narratives. The research process will be designed to continuously question our assumptions, and our simplifications will be corrected as required through future investigations.
The plan is to continue to develop and enhance the automated capability to correlate Context and Outcome by incorporating additional domain knowledge. For this first-generation process, we believe that it is essential to (1) maximize the information from the objective parameters about what happened in order to minimize the domain for analyzing why it happened, and (2) assume a simplified model of Behavior in order to begin analyzing automatically why it happened. In the experiment to be conducted during the next year, we will evaluate the ability to automatically extract useful information about why a set of similar incidents occurred based on this simplified model.