The result obtained from a match operation is a mapping specifying the matching elements between two input schemas.⁷ As already indicated in Figure 5.2b, we consistently capture each pair of matching elements in a single correspondence, i.e., 1:1 local cardinality, together with a similarity value between 0 (strong dissimilarity) and 1 (strong similarity) to indicate the plausibility of the correspondence. However, one element may occur in multiple correspondences so that n:m match relationships (global cardinality) are also possible, as, for example, firstName↔Name and lastName↔Name. The uniform representation of match results as sets of 1:1 correspondences makes it easier to combine match algorithms with different or varying result cardinalities.
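As a minimal sketch of this representation, a mapping can be held as a dictionary from (source, target) pairs to similarity values. The dictionary layout, the helper name sources_per_target, the element names street and Address, and all similarity values are assumptions made for illustration; they are not taken from the system described here.

```python
from collections import defaultdict

# Hypothetical representation: (source element, target element) -> similarity in [0, 1].
# The similarity values below are made up for illustration.
mapping = {
    ("firstName", "Name"): 0.7,
    ("lastName", "Name"): 0.8,
    ("street", "Address"): 0.6,
}


def sources_per_target(m):
    """Group source elements by the target element they correspond to."""
    grouped = defaultdict(set)
    for source, target in m:
        grouped[target].add(source)
    return dict(grouped)


sources_by_target = sources_per_target(mapping)

# Every entry is a 1:1 correspondence (local cardinality), yet "Name" occurs
# in two of them, which gives an n:1 relationship at the global level.
for target, sources in sorted(sources_by_target.items()):
    print(target, sorted(sources))
```

Because every entry is a single pair, results of matchers with different cardinalities can be stored and combined in the same structure.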
All generated mappings are maintained in memory by the Mapping Pool. Like the Schema Pool, the Mapping Pool provides several functions to perform common management tasks on mappings, such as loading a mapping from the repository into memory, persistently saving a mapping from memory to the repository, and importing and exporting selected mappings. Moreover, inspired by the model management approach [12, 14, 94], we have implemented in the Mapping Pool various operators for automatic mapping manipulation. Table 5.2 shows an overview of these operators, which are briefly described in the following:
7. We use ↔ to denote a (similarity-based) mapping between two schemas, e.g., m: S1↔S2 or simply S1↔S2, or a single correspondence between two elements, e.g., c: s1↔s2 or simply s1↔s2 with s1∈S1 and s2∈S2.
Table 5.2 Operations on mappings

Operation                             Description/Output
Transpose(m: S1↔S2)                   {s2↔s1 | s1↔s2 ∈ m}
Domain(m: S1↔S2)                      {s1 ∈ S1 | ∃s2 ∈ S2 ∧ s1↔s2 ∈ m}
InvertDomain(m: S1↔S2)                S1 \ Domain(m: S1↔S2)
RestrictDomain(m: S1↔S2, S)           {s1↔s2 | s1↔s2 ∈ m ∧ s1 ∈ S}
SchemaMerge(m: S1↔S2)                 S1 ∪ InvertRange(m: S1↔S2)
MappingMerge(m1: S1↔S2, m2: S1↔S2)    {s1↔s2 | s1↔s2 ∈ m1 ∨ s1↔s2 ∈ m2}
Diff(m1: S1↔S2, m2: S1↔S2)            {s1↔s2 | s1↔s2 ∈ m1 ∧ s1↔s2 ∉ m2}
Intersect(m1: S1↔S2, m2: S1↔S2)       {s1↔s2 | s1↔s2 ∈ m1 ∧ s1↔s2 ∈ m2}
MatchCompose(m1: S1↔S, m2: S↔S2)      {s1↔s2 | s ∈ S ∧ s1↔s ∈ m1 ∧ s↔s2 ∈ m2}
Compare(m1: S1↔S2, m2: S1↔S2)         Quality of m1 with respect to m2 containing correct correspondences
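The set-style definitions in Table 5.2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual implementation: a mapping is assumed to be a dictionary from (source, target) pairs to similarity values, all function names are hypothetical, and the handling of similarity values in mapping_merge and intersect is an assumption, since the table only defines which correspondences are returned.

```python
# Assumed representation: a mapping is a dict {(source, target): similarity}.
# Using the (source, target) pair as the key means two correspondences are
# equal exactly when both their source and target elements are the same.

def transpose(m):
    """Swap source and target in every correspondence."""
    return {(t, s): sim for (s, t), sim in m.items()}


def domain(m):
    """Source elements involved in at least one correspondence."""
    return {s for (s, _t) in m}


def invert_domain(m, s1_elements):
    """Source elements of schema S1 that occur in no correspondence."""
    return set(s1_elements) - domain(m)


def restrict_domain(m, elements):
    """Correspondences whose source element lies in the given element set."""
    return {(s, t): sim for (s, t), sim in m.items() if s in elements}


def mapping_merge(m1, m2):
    """Union of correspondences. Assumption: keep the higher similarity on overlap."""
    merged = dict(m1)
    for pair, sim in m2.items():
        merged[pair] = max(sim, merged.get(pair, 0.0))
    return merged


def intersect(m1, m2):
    """Correspondences in both mappings. Assumption: keep the similarity from m1."""
    return {pair: sim for pair, sim in m1.items() if pair in m2}


def diff(m1, m2):
    """Correspondences in m1 that are not in m2."""
    return {pair: sim for pair, sim in m1.items() if pair not in m2}


# Illustrative data; element names and similarity values are made up.
m1 = {("firstName", "Name"): 0.5, ("street", "Address"): 0.75}
m2 = {("firstName", "Name"): 1.0, ("lastName", "Name"): 0.5}
print(sorted(mapping_merge(m1, m2)))
print(sorted(intersect(m1, m2)))
print(sorted(diff(m1, m2)))
```

Since the dictionary does not carry the schemas themselves, invert_domain takes the element set of S1 as an explicit argument in this sketch.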
• Transpose: This operator swaps the source and target elements in all correspondences of the input mapping. This inversion is possible because the correspondences carry no mapping expressions, only similarity values, which are assumed to be undirected.
• Domain and InvertDomain: Given a mapping, Domain returns the source elements involved in one or more correspondences. Conversely, InvertDomain returns the source elements that are not involved in any correspondence of the mapping.
• RestrictDomain: This operator takes a mapping and a set of elements as input and returns those correspondences of the mapping whose source element is contained in the element set.
• SchemaMerge: This operator takes a mapping as input and merges the element sets of the input schemas. In particular, it takes the element set of the source schema and adds the non-matching elements of the target schema. The non-matching elements to be added are identified using InvertRange, the counterpart of InvertDomain, which returns the target elements not involved in the mapping, i.e., not matched.
• MappingMerge: This operator takes as input two mappings involving the same source and target schemas and produces a new mapping with the unique set of correspondences contained in either mapping. Two correspondences are regarded as equal if they have the same source element and the same target element. Otherwise, the correspondences are considered distinct.
• Intersect: This operator takes as input two mappings involving the same source and target schemas and produces a new mapping containing the set of correspondences contained in both input mappings. It is based on the same equality notion of correspondences as MappingMerge.
• Diff: This operator takes as input two mappings involving the same source and target schemas and produces a new mapping containing those correspondences contained in the first mapping but not in the second one. It also employs the equality notion of correspondences introduced for the MappingMerge operator.
• MatchCompose: Assuming the transitivity of similarity relationships, this operator performs a join-like operation on two or more mappings that successively share a common schema, such as S1↔S2 and S2↔S3, to derive a new mapping between S1 and S3. It represents the main mechanism for reusing previously identified match results to solve a new match task and will be discussed in detail in Chapter 6.
• Compare: This operator estimates the quality of a test mapping against an expected/real mapping according to different quality measures. With the expected mapping containing all correspondences that should be found, this operator can be used to evaluate the results of automatic match operations. The quality measures for mapping comparison will be discussed in detail in Chapter 10.
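The last two operators can be sketched as follows, again assuming a mapping is a dictionary from (source, target) pairs to similarity values. Two choices in this sketch are assumptions rather than statements about the actual system: match_compose averages the two similarity values along a path and keeps the best value when several paths lead to the same pair, and compare reports precision and recall as example quality measures. The text defers the real definitions to Chapters 6 and 10.

```python
def match_compose(m1, m2):
    """Join m1: S1<->S and m2: S<->S2 on the shared elements of S."""
    composed = {}
    for (s1, mid1), sim1 in m1.items():
        for (mid2, s2), sim2 in m2.items():
            if mid1 != mid2:
                continue
            # Assumption: combine similarities along the path by averaging.
            sim = (sim1 + sim2) / 2
            # Assumption: if several intermediate elements connect the same
            # pair, keep the highest combined similarity.
            composed[(s1, s2)] = max(sim, composed.get((s1, s2), 0.0))
    return composed


def compare(test, expected):
    """Assumed quality measures: precision and recall of test against expected."""
    test_pairs, expected_pairs = set(test), set(expected)
    correct = test_pairs & expected_pairs
    precision = len(correct) / len(test_pairs) if test_pairs else 0.0
    recall = len(correct) / len(expected_pairs) if expected_pairs else 0.0
    return {"precision": precision, "recall": recall}


# Illustrative data; element names and similarity values are made up.
m1 = {("firstName", "Name"): 0.5, ("lastName", "Name"): 1.0}
m2 = {("Name", "fullName"): 1.0}
composed = match_compose(m1, m2)

expected = {("firstName", "fullName"): 1.0, ("middleName", "fullName"): 1.0}
quality = compare(composed, expected)
print(composed)
print(quality)
```

Only the (source, target) pairs enter the comparison here; similarity values are ignored when counting correct correspondences.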
Note that Domain, InvertDomain, RestrictDomain, and SchemaMerge are defined with respect to the source schema of the input mapping. Their counterparts concerning the target schema, such as Range, InvertRange, and RestrictRange, can easily be defined with the help of the Transpose operator, such as Range(m) = Domain(Transpose(m)). Except for Compare, all operators yield either a set of schema elements (Domain, InvertDomain, SchemaMerge) or a set of correspondences (Transpose, RestrictDomain, MappingMerge, Intersect, Diff, and MatchCompose). While the correspondences are simply stored in a new mapping, we support transforming the set of schema elements into a new schema, so that it can be further matched and manipulated like other schemas. This is done by preserving structural relationships between the elements and their ancestors to make the new schema structurally consistent with the input schemas (see Section 18.3).
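A short sketch shows how the target-side counterparts follow from Transpose and how SchemaMerge builds on InvertRange. As before, the dictionary representation and all function names are assumptions for illustration. The sketch returns a flat set of element names and therefore does not model the structural relationships that the actual transformation into a new schema preserves.

```python
def transpose(m):
    """Swap source and target in every correspondence."""
    return {(t, s): sim for (s, t), sim in m.items()}


def domain(m):
    """Source elements involved in at least one correspondence."""
    return {s for (s, _t) in m}


def range_(m):
    """Range(m) = Domain(Transpose(m)); trailing underscore avoids the builtin."""
    return domain(transpose(m))


def invert_range(m, s2_elements):
    """Target elements of schema S2 that occur in no correspondence."""
    return set(s2_elements) - range_(m)


def schema_merge(m, s1_elements, s2_elements):
    """S1 united with the non-matched elements of S2."""
    return set(s1_elements) | invert_range(m, s2_elements)


# Illustrative schemas and mapping; names and similarity values are made up.
s1 = {"firstName", "lastName", "street"}
s2 = {"Name", "Address", "Phone"}
m = {
    ("firstName", "Name"): 0.7,
    ("lastName", "Name"): 0.8,
    ("street", "Address"): 0.6,
}

merged_elements = schema_merge(m, s1, s2)
print(sorted(merged_elements))
```

Here Phone is the only target element without a correspondence, so it is the only element added to the source element set.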
The new schemas and mappings generated by the operators are automatically added to the Schema and Mapping Pool, respectively, for visualization on the GUI and further manipulation. Besides supporting user interaction, all mapping operations can also be utilized in match strategies to define workflows of match processing.