
Textual Case-Based Reasoning

2.3 Knowledge in CBR Systems

Informally, knowledge can be regarded as something that is needed during problem solving or decision making. Several types of knowledge can be distinguished with respect to their nature or use. Particularly important to CBR systems is the distinction based on the nature of knowledge, made by [Richter, 1998]:

Background knowledge – general and problem independent knowledge.

Contextual knowledge – domain specific knowledge.

Episodic knowledge – a narrative of something that happened in the past.

A unique characteristic of CBR (not shared by other types of knowledge systems) is that it incorporates episodic knowledge in the form of cases.

In order to use these types of knowledge within a CBR system, knowledge needs to be accessible. By structuring and storing knowledge in the so-called knowledge containers [Richter, 1995], it is possible to make knowledge reusable for several applications.

In general, there are four knowledge containers in a CBR system:

• the vocabulary,

• the similarity measures,

• the case base, and

• the adaptation knowledge.

[Figure: nested diagram with the Vocabulary as the outer layer, enclosing the Similarity Measures, the Case Base, and the Adaptation Knowledge.]

Figure 2.4: The four knowledge containers of a CBR system. Source: [Roth-Berghofer and Cassens, 2005]

A graphical representation of the relations among these containers is shown in Figure 2.4. Vocabulary, shown as the outer layer, serves as a basis for all the other containers. Indeed, the vocabulary defines objects of a domain, their attributes² and values, as well as different types of relationships among these objects. A case will be represented by using attributes, values, and relations that are part of the vocabulary.

To better understand the nature of the knowledge containers, consider the CBR system CHEF, outlined in Figure 2.5. The vocabulary for this system will contain knowledge that describes objects and relationships in the culinary domain, necessary for representing the cases (i.e., the recipes). An example is a hierarchy that categorizes ingredients according to their substantial nature: beef, pork, and lamb are kinds of meat; broccoli, green bean, and eggplant are kinds of vegetables; and mango, avocado, and pineapple are kinds of exotic fruits. Another part of the domain knowledge consists of the measurement units for the different types of ingredients, such as lb. (pound), tablespoon, piece, or chunk, which can be regarded as attributes of ingredients.
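As a rough illustration of what such a vocabulary might look like in code, the following minimal sketch encodes the ingredient hierarchy and the measurement units from the paragraph above. The names `TAXONOMY`, `UNITS`, `Ingredient`, and `same_category` are hypothetical and chosen only for this example; CHEF's actual representation is not described at this level of detail here.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: each ingredient maps to its category,
# using only the examples named in the text.
TAXONOMY = {
    "beef": "meat", "pork": "meat", "lamb": "meat",
    "broccoli": "vegetable", "green bean": "vegetable", "eggplant": "vegetable",
    "mango": "exotic fruit", "avocado": "exotic fruit", "pineapple": "exotic fruit",
}

# Measurement units named in the text, treated as allowed attribute values.
UNITS = {"lb", "tablespoon", "piece", "chunk"}


@dataclass(frozen=True)
class Ingredient:
    """One ingredient of a case, expressed only in vocabulary terms."""
    name: str
    amount: float
    unit: str

    def __post_init__(self):
        # A case may only use objects and values the vocabulary defines.
        if self.name not in TAXONOMY:
            raise ValueError(f"unknown ingredient: {self.name!r}")
        if self.unit not in UNITS:
            raise ValueError(f"unknown unit: {self.unit!r}")


def same_category(a: str, b: str) -> bool:
    """True if both ingredients fall under the same node of the hierarchy."""
    return TAXONOMY[a] == TAXONOMY[b]
```

In this sketch, a case that mentions an ingredient or unit outside the vocabulary is rejected, which mirrors the idea that the vocabulary serves as the basis for the other containers.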

² Often, attributes are referred to as features.

CHEF - A case-based reasoning system [Hammond, 1989].

CHEF is one of the earliest and most sophisticated CBR systems reported in the literature. The following description is based on [Kolodner, 1993].

General Description:

CHEF’s goal is the creation of culinary recipes by taking into account the ingredients and dish qualities desired by the user. CHEF has at its disposal a library of existing recipes, which are used as a basis for creating new recipes. Furthermore, CHEF has a simulator that “tries out” the recipe, in order to detect possible failures and adapt the recipe accordingly.

Case-base:

The case-base contains recipes. The recipe Broccoli with Tofu follows:

Problem:

act1 (chop object (ingr1) size (chunk)) ...

act6 (stir-fry object (result act5) time (2))) (style stir-fry)

CHEF at Work

RETRIEVER:

Searching for plan that satisfies --Include beef in the dish.

Figure 2.5: Some details on the CBR system CHEF

In order to retrieve an existing recipe when a new query is presented to the system, the retrieval process needs similarity measure knowledge, stored in the corresponding container. Depending on the type of the attributes used for case representation, different kinds of local similarity³ measures might be needed. If the attributes are numerical, domain-independent similarity measures, such as those based on the Euclidean distance, can be used. However, if the attributes have symbolic values, a domain-dependent similarity measure is needed. Such a measure could be based upon a semantic taxonomy like the hierarchy of ingredients mentioned previously.

Indeed, the retrieval scenario shown in Figure 2.5 indicates that, when a request for a “beef and broccoli” recipe is presented, the system retrieves a “beef and green beans” recipe, because both broccoli and green beans are vegetables. Moreover, the similarity measure container can also store knowledge that defines the importance (weight) of each attribute in the global similarity measure. For example, when preparing the recipe of a main dish, the presence of a meat ingredient in a retrieved recipe can be weighted more heavily than the presence of a vegetable ingredient.
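The interplay of local and global similarity can be sketched as follows. This is a minimal illustration under stated assumptions, not CHEF's actual retrieval mechanism: the scores 1.0, 0.5, and 0.0, the weights, the attribute names `meat` and `vegetable`, and the two extra recipes in the case base are all hypothetical.

```python
# Hypothetical ingredient hierarchy, taken from the examples in the text.
TAXONOMY = {
    "beef": "meat", "pork": "meat", "lamb": "meat",
    "broccoli": "vegetable", "green bean": "vegetable", "eggplant": "vegetable",
}

# Assumed weights: for a main dish, meat counts more than vegetable.
WEIGHTS = {"meat": 2.0, "vegetable": 1.0}


def local_similarity(a, b):
    """Symbolic local similarity based on the taxonomy (assumed scores)."""
    if a == b:
        return 1.0  # identical values
    cat_a, cat_b = TAXONOMY.get(a), TAXONOMY.get(b)
    if cat_a is not None and cat_a == cat_b:
        return 0.5  # different values under the same category
    return 0.0      # unrelated or missing value


def global_similarity(query, case):
    """Weighted average of the local similarities over the query attributes."""
    total = sum(WEIGHTS[attr] for attr in query)
    score = sum(
        WEIGHTS[attr] * local_similarity(value, case.get(attr))
        for attr, value in query.items()
    )
    return score / total


def retrieve(query, case_base):
    """Return the name of the most similar case."""
    return max(case_base, key=lambda name: global_similarity(query, case_base[name]))


# Hypothetical case base; only the first recipe is named in the text.
CASE_BASE = {
    "beef and green beans": {"meat": "beef", "vegetable": "green bean"},
    "pork and eggplant": {"meat": "pork", "vegetable": "eggplant"},
    "broccoli only": {"vegetable": "broccoli"},
}

QUERY = {"meat": "beef", "vegetable": "broccoli"}
best = retrieve(QUERY, CASE_BASE)
```

With these assumed numbers, the “beef and green beans” case scores highest for the “beef and broccoli” query: the meat attribute matches exactly and carries the larger weight, while broccoli and green bean match only through their shared category.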

Very often, the retrieved case cannot be used in its original form; the old solution needs to be adapted to the new problem. The knowledge needed for transforming the solution is known as adaptation knowledge and is stored in a separate container. There are several types of adaptation knowledge, in accordance with the goals of the CBR system. For the scenario in Figure 2.5, the simplest adaptation is to replace the ingredient ‘green bean’ with ‘broccoli’ everywhere in the recipe. However, because CHEF is a sophisticated system, it uses other types of adaptation, too. For example, a so-called critique (a kind of rule) is attached to the ingredient ‘broccoli’ and fires whenever ‘broccoli’ is included in a recipe. In such a situation, the critique adds a new step to the recipe, which for the mentioned ingredient is: “Chop broccoli into pieces the size of chunks.” Then, if failures occur during the simulation of the recipe, this knowledge is also added to the repository, in order to anticipate that kind of failure during subsequent uses of the recipe.
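The two kinds of adaptation described above, substitution and critique, can be sketched as follows. This is only an illustration under assumptions: the recipe steps are invented for the example, the names `substitute`, `CRITIQUES`, `apply_critiques`, and `adapt` are hypothetical, and placing the critique's step at the start of the recipe is a simplification, since the text does not say where CHEF inserts it.

```python
def substitute(steps, old, new):
    """Substitution: replace the ingredient `old` with `new` in every step."""
    return [step.replace(old, new) for step in steps]


# Hypothetical critiques: ingredient -> step added when the ingredient is used.
CRITIQUES = {
    "broccoli": "Chop broccoli into pieces the size of chunks.",
}


def apply_critiques(steps, ingredients):
    """Fire the critique of each ingredient present, skipping steps already there.

    Assumption: the added step goes to the front of the recipe.
    """
    extra = [
        CRITIQUES[name]
        for name in ingredients
        if name in CRITIQUES and CRITIQUES[name] not in steps
    ]
    return extra + steps


def adapt(steps, ingredients, old, new):
    """Adapt a retrieved recipe: substitute first, then let critiques fire."""
    new_ingredients = [new if name == old else name for name in ingredients]
    new_steps = apply_critiques(substitute(steps, old, new), new_ingredients)
    return new_steps, new_ingredients


# Invented steps for the retrieved "beef and green beans" recipe.
retrieved_steps = ["Stir-fry the beef.", "Add the green beans and stir-fry."]
retrieved_ingredients = ["beef", "green beans"]

adapted_steps, adapted_ingredients = adapt(
    retrieved_steps, retrieved_ingredients, old="green beans", new="broccoli"
)
```

The sketch leaves out the simulator and the storage of failure knowledge, which would require a model of cooking that goes beyond this example.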

As can be seen from this brief description of the CHEF system, a complete CBR system requires many kinds of knowledge besides the cases in the case base.

Providing this knowledge is not always easy; therefore, many CBR systems try to make do with whatever knowledge is already available. This is particularly true for TCBR systems, where knowledge for the knowledge containers is more difficult to obtain, as will become clear in Section 2.4.

In document Knowledge Extraction and Summarization for Textual Case-Based Reasoning: A Probabilistic Task Content Modeling Approach (Page 30-33)