
Chapter 2 Literature Review

2.2 Knowledge Acquisition

2.2.3 Knowledge Comparison

The strong research and development focus on expert systems in the 1980s, and the identification of the difficulties involved in knowledge acquisition, also gave rise to research into the comparison and consolidation of knowledge. As there was a wide variety of knowledge modelling techniques available, this research generally focused on how to perform knowledge comparisons for each of those specific modelling techniques.

In 1989 Shaw and Gaines identified that when acquiring knowledge from multiple experts, each may describe different parts of their knowledge, use different terminology, or use terminology differently (Shaw & Gaines, 1989). They described four possible situations in acquiring knowledge from multiple experts: consensus, when the experts use the same terminology for the same concepts; conflict, when experts use the same terminology for different concepts; correspondence, when experts use different terminology for the same concepts; and contrast, when experts use different terminology for different concepts. This scheme was applied to the knowledge of a group of experts, acquired as repertory grids: a technique that allows conceptual models to be defined by asking an expert to list what they consider to be the entities in their domain, and then to define distinctions between them (Fransella, Bell, & Bannister, 1979; B. R. Gaines, 1987; Shaw & Gaines, 1989). They concluded that any comparison of expert knowledge necessarily involves approximation, as evaluating a complete conceptual system is impractical: there must be some level of assumption about underlying concepts, which may not in fact be identical. However, identifying significant similarities or differences is a valuable task, as it promotes directed, contextual discussion among the experts that may reveal other, more subtle distinctions (Shaw & Gaines, 1989).
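The four situations described by Shaw and Gaines can be read as the combinations of two yes/no questions: do the experts use the same term, and do they mean the same concept? The following is a minimal sketch of that reading only. The names `Situation` and `classify` are hypothetical, and the sketch assumes that agreement on terms and concepts has already been judged; it does not reproduce the repertory grid analysis used in the original study.

```python
from enum import Enum


class Situation(Enum):
    """The four situations named by Shaw and Gaines (1989)."""
    CONSENSUS = "same terminology, same concept"
    CONFLICT = "same terminology, different concepts"
    CORRESPONDENCE = "different terminology, same concept"
    CONTRAST = "different terminology, different concepts"


def classify(same_term: bool, same_concept: bool) -> Situation:
    """Map two judgements about a pair of experts onto one situation.

    Both arguments are assumed to come from some earlier comparison step;
    this function only encodes the two-by-two scheme itself.
    """
    if same_term and same_concept:
        return Situation.CONSENSUS
    if same_term:
        return Situation.CONFLICT
    if same_concept:
        return Situation.CORRESPONDENCE
    return Situation.CONTRAST
```

For instance, two experts who both say "fever" but apply different temperature cut-offs would fall under `classify(True, False)`, which is the conflict situation.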

Dieng in 1997 described a method for combining multiple experts’ knowledge when that knowledge is represented as conceptual graphs (Dieng, 1997). Conceptual graphs are a technique for visually representing knowledge: at the simplest level, concepts are defined as graph nodes and relationships as the links between them, but conceptual graphs can also represent first-order logic and contain rules for reasoning. A conceptual graph contains a set of concepts, a set of relationships, and a set of individual markers, which indicate when a concept is a named entity rather than a type of entity (Chein & Mugnier, 2008). Dieng’s study describes a detailed algorithm for combining multiple conceptual graphs, including comparing the concept set, relation set, and individual markers in turn, and identifying and resolving synonyms and homonyms in the names of the components (Dieng, 1997). A problem with conceptual graphs, however, is that they are difficult to develop, requiring significant work by a knowledge engineer in interviewing experts and attempting to elicit the conceptual models that the experts use.
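The component-by-component comparison described above can be illustrated with a small sketch. This is not Dieng's algorithm: it is a simplified illustration that assumes each graph is reduced to three sets of names and that synonyms are supplied by hand in a hypothetical `synonyms` mapping. The class `ConceptualGraph` and both functions are likewise hypothetical. Homonyms (the same name used for different things) cannot be detected from names alone, so they are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualGraph:
    """Hypothetical, reduced view of a conceptual graph as three name sets."""
    concepts: frozenset
    relations: frozenset
    markers: frozenset


def compare_component(first, second, synonyms):
    """Compare one component after mapping each name to a preferred synonym."""
    a = {synonyms.get(name, name) for name in first}
    b = {synonyms.get(name, name) for name in second}
    return {"shared": a & b, "only_first": a - b, "only_second": b - a}


def compare_graphs(g1, g2, synonyms):
    """Compare the concept set, relation set, and individual markers in turn."""
    return {
        part: compare_component(getattr(g1, part), getattr(g2, part), synonyms)
        for part in ("concepts", "relations", "markers")
    }


# Invented example data for two experts.
expert_a = ConceptualGraph(
    concepts=frozenset({"patient", "disease"}),
    relations=frozenset({"has"}),
    markers=frozenset({"John"}),
)
expert_b = ConceptualGraph(
    concepts=frozenset({"patient", "illness", "treatment"}),
    relations=frozenset({"has", "treats"}),
    markers=frozenset(),
)
report = compare_graphs(expert_a, expert_b, synonyms={"illness": "disease"})
```

With the synonym mapping applied, "illness" and "disease" are treated as one shared concept, while "treatment" is reported as appearing only in the second expert's graph.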

Richards and Compton’s combination of Formal Concept Analysis (FCA) and Ripple Down Rules (RDR), also in 1997, could also be used to compare the concepts in different experts’ knowledge. The derived concept lattices of two knowledge bases could provide a visual representation of the concepts implied by each expert’s rules, allowing easier visual identification of their differences (Richards & Compton, 1997c). This method was shown to be effective in the identification of broad conceptual differences, for example when an expert defines classes which another did not (Richards & Compton, 1997c). However, this method is less effective at identifying subtler differences between experts’ knowledge, and presents no information about the significance of each difference. For example, if two experts’ knowledge bases displayed a minor difference in the values used in certain rule conditions, say one expert used x<20% and the other x<25%, this could visually appear as significant a difference as one expert having an entirely new rule. Nor can the viewer tell how much these differences matter in practice: two knowledge bases may contain rules which use many subtly different conditions, yet almost invariably present identical results. This is of course not a drawback in all circumstances: if examining how experts conceptually regard problems, the identification of those differences might present a significant result in itself. However, when comparing knowledge bases with a large number of differences, information on the significance of each difference may be needed to perform the comparisons efficiently and effectively.

Similarly, Beydoun and Hoffmann’s method for automatically integrating multiple knowledge bases is applicable as a knowledge comparison method (Beydoun et al., 2005). However, while this method worked well for automatically combining knowledge bases (as much as is practical), it made no provision for resolving conflicts or improving expert knowledge. Making comparisons using all possible values for all attributes is also a concern, as the maximum ranges of attributes are not always obvious and modelling them could take considerable effort. In addition, without considering the likely distribution of values for each attribute, the resulting comparison may misrepresent the significance of a difference: a small difference in value for one rule condition may conceivably result in 100 cases classified differently or none, depending on where within the distribution the condition’s value lies.
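The dependence on the distribution of values can be made concrete with a small sketch. It reuses the x<20% versus x<25% thresholds mentioned earlier; the function `count_disagreements` and both case lists are invented for illustration and are not taken from any of the cited studies.

```python
def count_disagreements(values, threshold_a, threshold_b):
    """Count cases that satisfy exactly one of the two conditions
    `value < threshold_a` and `value < threshold_b`."""
    return sum((v < threshold_a) != (v < threshold_b) for v in values)


# Hypothetical attribute values, in percent, for two sets of cases.
clustered_cases = [21, 22, 22, 23, 24]  # values lie between the thresholds
spread_cases = [5, 10, 40, 60, 80]      # values lie well away from them

near = count_disagreements(clustered_cases, 20, 25)
far = count_disagreements(spread_cases, 20, 25)
```

Under these assumed values, every clustered case is classified differently by the two conditions, while none of the spread cases are, even though the difference between the rule conditions is the same in both comparisons.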

At approximately the same time, Vazey and Richards conducted studies into the application of the wiki paradigm to knowledge acquisition, whereby many experts can collaboratively update a central store of knowledge. In their approach, all parts of the knowledge base or cases could be edited or removed, with conflicts identified by tracking a history of these changes, or by marking the rules as accepted or in conflict. Identified conflicts were brought to the attention of the users who created the conflict, and resolved through online discussion and other users’ input (Richards, 2009; Richards & Vazey, 2005; Vazey & Richards, 2006).

The ICT support domain, however, presented some quite different features to other application domains of MCRDR. The most fundamental difference is that each of the users has relatively little knowledge specific to each ICT problem, with their expertise focused primarily on general problem-solving skills: a survey of users indicated that for 67% of cases the user would not have the knowledge to resolve the case, and would need to refer to other sources (Richards & Vazey, 2005). The focus of the development therefore was to incorporate these other sources into the central knowledge base, making the knowledge base the primary source of knowledge. Thus, the most important goal of knowledge acquisition in the ICT support domain is to make the knowledge base as complete and correct as possible, without particular concern for the users’ knowledge, as it is assumed that they will be retrieving their knowledge from the knowledge base. This subtly contrasts with the goal in other applications of MCRDR, such as the medical domain considered in this study, where the goal is to support an expert’s knowledge and decisions rather than present authoritative solutions (Musen, Shahar, & Shortliffe, 2006). A further difference in the domain is in the outcome of a case. In ICT support, a case is correctly resolved once the problem is corrected. Unless the problem subsequently recurs, the solution can be said to be correct regardless of what the solution may have been. This does not always apply in other domains, however. In a medical interpretation setting, the resolution of a case is often ambiguous: different experts may well provide different interpretations, and there is often no conclusive evidence as to which interpretation is correct. The consequence of these differences is a focus on allowing knowledge to be collaboratively corrected, but little work on how to assist in that resolution. This accurately models the Web 2.0 paradigm and was shown to work in the ICT support domain, but is impractical for a domain such as medicine where conflicts in knowledge may appear without obvious solutions, especially without wide-ranging collaboration.

In document A method for knowledge discovery and development with health data (Page 54-57)