A Case-Based Approach for Reuse in Software Design

4.2 Analogy Module

4.2.2 Mapping Process

The second phase of analogy maps each candidate case to the query diagram, yielding a list of object mappings for each candidate case. This phase relies on two alternative algorithms, one guided by relation mapping and the other by object mapping. Both return a list of mappings between objects.

Relation-Based Mapping

The relation-based algorithm (see figure 4.14) uses the UML relations to establish the object mappings. In line 1, the algorithm starts the mapping process by selecting a query relation based on the independence measure, which chooses the relation that connects the two most important diagram objects (the ones with the highest independence score). The MappingList is initialized to an empty list (line 2). Then, while there are relations left to process (line 3), it tries to find a matching relation in the candidate diagram: it gets the best relation from the query diagram (line 4), gets the best matching relation from the base case (line 5), and maps the objects of the two relations (line 6). After this mapping, it removes the processed relation, stores the mapping, and spreads the mapping to the neighbor relations using the diagram relations (lines 7, 8 and 9). In the end it returns the list of mappings (line 11). This algorithm maps objects in pairs corresponding to the relation’s objects.
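The steps of the relation-based algorithm can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the system: the tuple representation of relations, the use of the summed independence of the two endpoints as the relation score, the `sim` callback, and the specific structural check (same relation type, consistent with the objects already mapped) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical representation: (relation type, source object, target object)
Relation = Tuple[str, str, str]


def relation_based_mapping(
    query: List[Relation],
    case: List[Relation],
    ind: Dict[str, float],
    sim: Callable[[str, str], float],
) -> List[Tuple[str, str]]:
    """Return (query object, case object) pairs, following figure 4.14."""
    obj_map: Dict[str, str] = {}
    used_case: Set[Relation] = set()
    seen: Set[Relation] = set()

    def weight(rel: Relation) -> float:
        # Assumed relation score: independence of both connected objects.
        return ind[rel[1]] + ind[rel[2]]

    def consistent(q_obj: str, c_obj: str) -> bool:
        # A pair is allowed if it repeats an existing mapping,
        # or if both objects are still unmapped.
        if q_obj in obj_map:
            return obj_map[q_obj] == c_obj
        return c_obj not in obj_map.values()

    if not query:
        return []
    relations = [max(query, key=weight)]              # line 1
    seen.add(relations[0])
    while relations:                                  # line 3
        p_rel = max(relations, key=weight)            # line 4
        kind, a, b = p_rel
        candidates = [                                # line 5
            c for c in case
            if c not in used_case and c[0] == kind
            and (c[1] == c[2]) == (a == b)
            and consistent(a, c[1]) and consistent(b, c[2])
        ]
        relations.remove(p_rel)                       # line 7
        if candidates:
            c_rel = max(candidates, key=lambda c: sim(a, c[1]) + sim(b, c[2]))
            used_case.add(c_rel)
            obj_map[a], obj_map[b] = c_rel[1], c_rel[2]   # lines 6 and 8
        for rel in query:                             # line 9
            if rel not in seen and rel[0] == kind and {rel[1], rel[2]} & {a, b}:
                seen.add(rel)
                relations.append(rel)
    return list(obj_map.items())                      # line 11
```

The `seen` set is an addition of this sketch: it guarantees termination when a query relation finds no matching candidate, a situation the figure does not spell out.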

Object-Based Mapping

The object-based algorithm (see figure 4.15) starts the mapping by selecting the most independent query object, based on the UML independence heuristic. After finding the corresponding candidate object, it tries to map the neighbor objects of the query object, taking the object’s relations as constraints in the mapping. This algorithm and the one guided by relation mappings are described in detail in section 4.1.3, where they are used as matching algorithms.

1. Relations ← Get best relation from the query diagram based on the independence measure
2. MappingList ← ∅
3. WHILE Relations ≠ ∅ DO
4. PRelation ← Get best relation from Relations based on the independence measure
5. CRelation ← Get best matching relation from base case relations that are matching candidates (structural constraints must be met)
6. Mapping ← Get the mapping between objects, based on the mapping between PRelation and CRelation
7. Remove PRelation from Relations
8. Add Mapping to MappingList
9. Add to Relations all the same type relations adjacent to PRelation, which are not already mapped (if PRelation connects A and B then the adjacent relations of PRelation are the relations in which A or B are part of, excluding PRelation)
10. ENDWHILE
11. Return MappingList

Figure 4.14: The relation-based mapping algorithm.

1. Objects ← Get object from the query diagram with the highest independence measure
2. MappingList ← ∅
3. WHILE Objects ≠ ∅ DO
4. PObject ← Get best object from Objects, based on the object’s independence measure
5. CObject ← Get best matching object from the base case objects that are matching candidates (structural constraints must be met)
6. Mapping ← Get the mapping between PObject and CObject
7. Remove PObject from Objects
8. Add Mapping to MappingList
9. Add to Objects all the objects adjacent to PObject which are not already mapped (an object B is adjacent to an object A if B has a relation with A)
10. ENDWHILE
11. Return MappingList

Figure 4.15: The object-based mapping algorithm.
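The object-based algorithm of figure 4.15 can be sketched in the same way. Again this is only an illustration under stated assumptions: relations are hypothetical `(type, source, target)` tuples, `sim` is a caller-supplied similarity function, and the structural constraint is interpreted as "every relation between the query object and an already mapped neighbor must have a counterpart of the same type in the case".

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical representation: (relation type, source object, target object)
Relation = Tuple[str, str, str]


def object_based_mapping(
    query_objs: List[str],
    query: List[Relation],
    case_objs: List[str],
    case: List[Relation],
    ind: Dict[str, float],
    sim: Callable[[str, str], float],
) -> List[Tuple[str, str]]:
    """Return (query object, case object) pairs, following figure 4.15."""
    case_set = set(case)
    obj_map: Dict[str, str] = {}

    def compatible(p: str, c: str) -> bool:
        # Assumed structural constraint: relations to already mapped
        # neighbors must exist, with the same type, in the case diagram.
        for kind, a, b in query:
            if a == p and b in obj_map and (kind, c, obj_map[b]) not in case_set:
                return False
            if b == p and a in obj_map and (kind, obj_map[a], c) not in case_set:
                return False
        return True

    def neighbors(p: str) -> Set[str]:
        out = set()
        for _, a, b in query:
            if a == p:
                out.add(b)
            if b == p:
                out.add(a)
        return out - {p}

    if not query_objs:
        return []
    start = max(query_objs, key=lambda o: ind[o])     # line 1
    objects, seen = [start], {start}
    while objects:                                    # line 3
        p_obj = max(objects, key=lambda o: ind[o])    # line 4
        candidates = [                                # line 5
            c for c in case_objs
            if c not in obj_map.values() and compatible(p_obj, c)
        ]
        objects.remove(p_obj)                         # line 7
        if candidates:                                # lines 6 and 8
            obj_map[p_obj] = max(candidates, key=lambda c: sim(p_obj, c))
        for n in neighbors(p_obj):                    # line 9
            if n not in seen:
                seen.add(n)
                objects.append(n)
    return list(obj_map.items())                      # line 11
```

In this sketch a candidate with a higher similarity score is still rejected if it violates the relation constraints, which mirrors the role the text gives to the object's relations.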

Mapping Ranking Criteria

Both mapping algorithms presented above satisfy the structural constraints defined by the UML diagram relations. However, most of the resulting mappings do not map all the problem objects, so the mappings can be ranked using one of four ranking metrics:

• Based on the number of mapped objects:

$$\frac{K}{\mathit{PObjs}} \tag{4.25}$$

where K is the number of mapped objects, and PObjs is the number of objects in the problem.

CHAPTER 4. CBR Engine

• Based on the independence sum of mapped objects in the problem:

$$\frac{\sum_{i=1}^{m} Ind(\mathit{MappedPObj}_i)}{\sum_{j=1}^{n} Ind(\mathit{PObj}_j)} \tag{4.26}$$

where Ind(X) represents the independence heuristic of X, MappedPObj_i is the i-th mapped object in the problem, PObj_j is the j-th object in the problem, m is the number of mapped objects in the problem, and n is the number of objects in the problem.

• Based on the independence sum of mapped objects in the problem and the case:

1 −

where PM_i is the independence value of mapped problem object i, and CM_i is the independence value of mapped case object i.

• Based on the number of mapped objects and independence sum:

ω₁ · K

The KB Administrator can select the ranking that will be used. Some guidelines are provided in section 5.3.1.
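As an illustration, the first two ranking metrics (4.25 and 4.26) can be computed as in the sketch below. The dictionary-based representation of a mapping and of the independence scores, and the function names, are assumptions made for the example.

```python
from typing import Dict


def rank_by_mapped_count(mapping: Dict[str, str], n_problem_objects: int) -> float:
    """Metric (4.25): K / PObjs, with K the number of mapped problem objects."""
    if n_problem_objects == 0:
        return 0.0  # assumption: an empty problem gets the lowest rank
    return len(mapping) / n_problem_objects


def rank_by_independence(mapping: Dict[str, str], ind: Dict[str, float]) -> float:
    """Metric (4.26): independence of mapped problem objects over the total.

    `mapping` goes from problem objects to case objects; `ind` holds the
    independence value of every problem object.
    """
    total = sum(ind.values())
    if total == 0:
        return 0.0  # assumption: avoid division by zero
    return sum(ind[p] for p in mapping) / total


# Hypothetical problem with three objects, two of which were mapped.
independence = {"Teacher": 3.0, "Course": 2.0, "Room": 1.0}
partial = {"Teacher": "Professor", "Course": "Lecture"}
print(rank_by_mapped_count(partial, len(independence)))   # 2 of 3 objects mapped
print(rank_by_independence(partial, independence))        # 5 of 6 independence units
```

The two metrics can disagree: a mapping that covers few but highly independent objects scores well on (4.26) and poorly on (4.25).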

Similarity Metric for Mapping Objects

An important issue in the mapping stage is: which objects to map? Most of the time, there are several candidate objects for mapping with the problem object. To solve this issue, we have developed a metric that is used to choose the mapping candidate.

Because we have two mapping algorithms, one based on relations and another on objects, there are two metrics: one for objects and another for relations. These metrics are based on the WordNet distance between the objects’ synsets, and the relative position of these synsets in relation to the most specific common abstraction concept (see figure 4.16). This subsection describes the metric for ranking objects; the next subsection describes the metric for ranking relations.

[Figure: tree of synsets (S55, S109, S2, S23, S11, S87, S76) with the MSCA marked; D(A, MSCA) = 3, D(B, MSCA) = 2]

Figure 4.16: An illustration of the MSCA concept.

The similarity metric for mapping two objects (A and B) is based on three factors. The first one is the distance between A’s synset and B’s synset in the WordNet ontology (D₁). For the second factor, the Most Specific Common Abstraction (MSCA) between A and B synsets must be found (see figure 4.16 for an illustration). Considering the distance between A’s synset and MSCA (D(A, MSCA)), and the distance between B’s synset and MSCA (D(B, MSCA)), then the second factor is the relation between these two distances (D₂). This factor measures the difference between the level of abstraction of concept A and the level of abstraction of concept B. The last factor is the relative depth of MSCA in the WordNet ontology (D₃), which determines the objects’ level of abstraction. The formulas for these measures are described in the following equations.

The similarity between A and B is defined as:

$$ASim(A, B) = \begin{cases} \omega_1 \cdot D_1 + \omega_2 \cdot D_2 + \omega_3 \cdot D_3 & \text{if an MSCA exists between A and B,} \\ 0 & \text{otherwise,} \end{cases}$$

where ω₁, ω₂ and ω₃ are weights associated with each factor. The values are selected based on empirical work and are: 0.55, 0.3, and 0.15. The factor D1, which measures the distance between A and B synsets, is:

$$D_1 = 1 - \frac{D(A, B)}{2 \cdot \mathit{DepthMax}} \tag{4.29}$$

where DepthMax is the maximum depth of the is-a tree of WordNet. DepthMax in WordNet version 1.7.1 is 17.

For factor D2, we have that if A is equal to B then D2 is 1, otherwise it is:

$$D_2 = 1 - \frac{|D(A, MSCA) - D(B, MSCA)|}{\sqrt{D(A, MSCA)^2 + D(B, MSCA)^2}} \tag{4.30}$$

For factor D₃ we have:

$$D_3 = \frac{Depth(MSCA)}{\mathit{DepthMax}} \tag{4.31}$$

where Depth(MSCA) is the depth of MSCA in the WordNet is-a tree.
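A worked example helps to see how the three factors combine. The sketch below implements equations (4.29) to (4.31) with the weights given in the text; the function name and its parameters are hypothetical, and the input distances are assumed to be already computed from WordNet. The sample call uses the distances of figure 4.16, D(A, MSCA) = 3 and D(B, MSCA) = 2, and assumes that the shortest path between A and B passes through the MSCA (so D(A, B) = 5) and that Depth(MSCA) = 4.

```python
import math
from typing import Optional, Tuple

DEPTH_MAX = 17  # maximum depth of the WordNet 1.7.1 is-a tree, as stated in the text


def object_similarity(
    d_ab: int,
    d_a_msca: int,
    d_b_msca: int,
    depth_msca: Optional[int],
    weights: Tuple[float, float, float] = (0.55, 0.3, 0.15),
    depth_max: int = DEPTH_MAX,
) -> float:
    """Weighted sum of D1, D2 and D3; pass depth_msca=None if no MSCA exists."""
    if depth_msca is None:
        return 0.0
    w1, w2, w3 = weights
    # D1 (4.29): closeness of the two synsets.
    d1 = 1 - d_ab / (2 * depth_max)
    # D2 (4.30): difference in abstraction level; 1 when A equals B.
    if d_a_msca == 0 and d_b_msca == 0:
        d2 = 1.0
    else:
        d2 = 1 - abs(d_a_msca - d_b_msca) / math.hypot(d_a_msca, d_b_msca)
    # D3 (4.31): relative depth of the MSCA.
    d3 = depth_msca / depth_max
    return w1 * d1 + w2 * d2 + w3 * d3


# D1 = 1 - 5/34, D2 = 1 - 1/sqrt(13), D3 = 4/17
print(round(object_similarity(5, 3, 2, 4), 4))  # prints 0.7212
```

With these weights the synset distance D₁ dominates the result, while the depth of the MSCA contributes at most 0.15.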

Similarity Metric for Mapping Relations

The previous metric is used by the object-based mapping algorithm, where the best candidate object is selected for mapping with a problem object. For the relation-based mapping algorithm, the selection metric is not used for ranking objects, but for ranking relations. The selection metric for relations is based on the object similarity metric, but also takes into account the relations and the objects linked by the relations. The metric for ranking mapping relations is defined by the following expressions:

• Suppose that the two relations involved are R₁ (relates A with B) and R₂ (relates C with D), and:

– MAB is the MSCA between A and B (see figure 4.17 for an illustration).

– MAC is the MSCA between A and C.

– MBD is the MSCA between B and D.

– MCD is the MSCA between C and D.

• If MAB, MAC, MBD and MCD exist, then the value of the relation similarity metric is:

$$\omega_1 \cdot ASim(A, C) + \omega_2 \cdot ASim(B, D) + \omega_3 \cdot RASim(R_1, R_2) \tag{4.32}$$

where ASim(X, Y) is the similarity metric for mapping objects X and Y (described in the previous subsection). Weights ω₁, ω₂ and ω₃ are: 0.25, 0.25 and 0.5.

[Figure: objects A, B, C and D, relations R₁ and R₂, and the abstractions MAB, MAC, MBD and MCD]

Figure 4.17: An illustration of the most specific common abstractions used in the similarity metric for mapping relations.

• If (MAC and MBD exist) and (MAB and MCD do not exist), then the metric value is:

$$\frac{ASim(A, C) + ASim(B, D)}{2} \tag{4.33}$$

• Else the metric value is 0, meaning that the relations have no similarity.

RASim(R₁, R₂) evaluates the similarity between R₁ and R₂, and is given by:

$$RASim(R_1, R_2) = \omega_1 \cdot Length(R_1, R_2) + \omega_2 \cdot Angle(R_1, R_2) + \omega_3 \cdot Depth(R_1, R_2) \tag{4.34}$$

$$Length(R_1, R_2) = 1 - \frac{\left| \sqrt{D(A, MAB)^2 + D(B, MAB)^2} - \sqrt{D(C, MCD)^2 + D(D, MCD)^2} \right|}{\sqrt{D(A, MAB)^2 + D(B, MAB)^2} + \sqrt{D(C, MCD)^2 + D(D, MCD)^2}} \tag{4.35}$$

$$Angle(R_1, R_2) = 1 - \left| \frac{D(A, MAB)}{\sqrt{D(A, MAB)^2 + D(B, MAB)^2}} - \frac{D(C, MCD)}{\sqrt{D(C, MCD)^2 + D(D, MCD)^2}} \right| \tag{4.36}$$

$$Depth(R_1, R_2) = 1 - \frac{|Depth(A) + Depth(B) - Depth(C) - Depth(D)|}{2 \cdot \mathit{DepthMax}} \tag{4.37}$$

where ω₁, ω₂ and ω₃ are weights with values 0.25, 0.25 and 0.5. Length(R₁, R₂) is a factor that reflects the distance between objects in the relations: it compares the distances A − MAB and B − MAB with their counterparts, C − MCD and D − MCD. Angle(R₁, R₂) reflects the angle between the objects and the respective MSCAs: considering the trees A − MAB − B and C − MCD − D (see figure 4.17), it compares the disequilibrium between branches. Depth(R₁, R₂) reflects the depth in WordNet or, in other words, the abstraction level of the objects: it compares the absolute concept depths in the WordNet is-a tree.
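The relation metric of equations (4.32) to (4.37) can be sketched as below. The function names and the flat argument lists are hypothetical, the WordNet distances and depths are assumed to be precomputed, and the handling of degenerate cases (both distances to an MSCA equal to zero) is an assumption of this sketch, since the text does not discuss it.

```python
import math

DEPTH_MAX = 17  # WordNet 1.7.1 is-a tree depth, as stated in the text


def length_factor(da: float, db: float, dc: float, dd: float) -> float:
    """(4.35): da = D(A, MAB), db = D(B, MAB), dc = D(C, MCD), dd = D(D, MCD)."""
    l1, l2 = math.hypot(da, db), math.hypot(dc, dd)
    if l1 + l2 == 0:
        return 1.0  # assumption: two zero-length branches count as identical
    return 1 - abs(l1 - l2) / (l1 + l2)


def angle_factor(da: float, db: float, dc: float, dd: float) -> float:
    """(4.36): compares how unbalanced the two branch pairs are."""
    l1, l2 = math.hypot(da, db), math.hypot(dc, dd)
    r1 = da / l1 if l1 else 0.0  # assumption for the degenerate case
    r2 = dc / l2 if l2 else 0.0
    return 1 - abs(r1 - r2)


def depth_factor(depth_a: int, depth_b: int, depth_c: int, depth_d: int,
                 depth_max: int = DEPTH_MAX) -> float:
    """(4.37): compares the absolute depths of the four concepts."""
    return 1 - abs(depth_a + depth_b - depth_c - depth_d) / (2 * depth_max)


def ra_sim(length: float, angle: float, depth: float) -> float:
    """(4.34) with weights 0.25, 0.25 and 0.5."""
    return 0.25 * length + 0.25 * angle + 0.5 * depth


def relation_similarity(asim_ac: float, asim_bd: float, rasim: float,
                        mab: bool, mac: bool, mbd: bool, mcd: bool) -> float:
    """Case analysis of the text; each flag says whether that MSCA exists."""
    if mab and mac and mbd and mcd:
        return 0.25 * asim_ac + 0.25 * asim_bd + 0.5 * rasim   # (4.32)
    if mac and mbd and not mab and not mcd:
        return (asim_ac + asim_bd) / 2                          # (4.33)
    return 0.0


# Hypothetical distances: R1 has branches 3 and 4, R2 has branches 6 and 8.
r = ra_sim(length_factor(3, 4, 6, 8), angle_factor(3, 4, 6, 8),
           depth_factor(5, 6, 4, 5))
print(relation_similarity(0.8, 0.6, r, True, True, True, True))
```

In the sample call the two relations have the same branch proportions, so the angle factor is 1 while the length factor drops to 2/3 because one pair of branches is twice as long as the other.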
