
11.1 Developer Evaluation of MatchMaps

11.1.2 Evaluation of Object Matching

In this section, we report the use of MatchMaps to match OSM data (building layer) [OpenStreetMap, 2014] to OSGB MasterMap data (Address Layer and Topology Layer) [Ordnance Survey, 2014a]. The study areas are the city centres of Nottingham and Southampton, UK, shown in Fig. 1.1 and Fig. 11.1 respectively. The Nottingham data was obtained in 2012, and the Southampton data in 2013. The numbers of spatial objects in the case studies are shown in Table 11.4. The number of OSM objects is smaller in each case, because OSM data often describes a collection of OSGB objects as a whole; for example, a group of OSGB shops may appear as a single shopping centre in OSM.

FIGURE 11.1: The geometric representations of Southampton city centre from OSGB (left) and OSM (right)

TABLE 11.4: Data used for evaluation

Study area     OSM spatial objects   OSGB spatial objects
Nottingham                     281                  13204
Southampton                   2130                   7678

We chose these two datasets for evaluation because they have a reasonable representation in OSM (city centres usually attract more attention from OSM contributors, and a variety of buildings and places are represented there) and are of reasonable size. In both cases, we set the value of σ used in geometry matching to be 20 metres.

The main objective of the evaluation was to establish the precision and recall of MatchMaps. Given the size of the case studies, it was infeasible for domain experts to produce a complete set of ground truth matches manually. Instead, we computed the ground truth as follows. For each OSM object a, we checked all matches involving a that were produced by MatchMaps (either a single sameAs(a, b) match with some b in the OSGB dataset, or several partOf matches involving a). If the match or matches were determined by a human expert to be correct, a was classified as ‘Correctly Matched’ (True Positive or TP); otherwise it was classified as ‘Incorrectly Matched’ (False Positive or FP). For a ∈ FP, a check was made whether a correct match for a existed; if yes, a was labelled FPsbm. If a was not involved in any matches, a check was made whether a correct match for it existed. If there was no correct match, then a was placed in ‘Correctly Not-matched’ (True Negative or TN), otherwise in ‘Incorrectly Not-matched’ (False Negative or FN).

Straightforward matches were checked by a non-expert using guidelines developed in conjunction with a subject matter expert from the Nottingham Geospatial Institute. A subject matter expert at Ordnance Survey (Great Britain’s National Mapping Authority) classified non-straightforward cases (approximately 10% of the total output of the system for the given datasets). In this way, OSM spatial objects in the Nottingham case and the Southampton case were classified into categories, as shown in Fig. 11.2. Note that the size of each group is the number of OSM spatial objects in it. For example, for the Victoria Centre in OSM, though there are hundreds of partOf matches involving it, it is only counted as one element in ‘Correctly Matched’.

Precision was computed as the ratio of |TP| to |TP| + |FP|, and recall as the ratio of |TP| to |TP| + |FN| + |FPsbm|. As shown in Table 11.5, for both the Nottingham and Southampton cases, precision is ≥ 90% and recall ≥ 84%.

TABLE 11.5: Matching OSM spatial objects to OSGB

Study area     TP     FP    TN    FN    Precision   Recall
Nottingham     177    19    64    21    0.90        0.84
Southampton    1997   21    71    41    0.98        0.97
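The classification procedure and the two measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the three boolean inputs and the four sample outcomes are hypothetical and are not part of MatchMaps; only the category definitions, the two formulas and the Nottingham counts come from the text and Table 11.5.

```python
from collections import Counter


def classify(matched: bool, judged_correct: bool, correct_match_exists: bool) -> str:
    """Place one OSM object in a category.

    matched: MatchMaps produced at least one match involving the object.
    judged_correct: the reviewer judged that match (or set of matches) correct.
    correct_match_exists: some correct match exists in the OSGB data.
    """
    if matched:
        if judged_correct:
            return "TP"
        # A wrong match for an object that does have a correct match is still
        # a false positive, but is additionally tagged FP_sbm.
        return "FP_sbm" if correct_match_exists else "FP"
    return "FN" if correct_match_exists else "TN"


def precision(tp: int, fp: int) -> float:
    """|TP| / (|TP| + |FP|); fp counts all false positives, FP_sbm included."""
    return tp / (tp + fp)


def recall(tp: int, fn: int, fp_sbm: int) -> float:
    """|TP| / (|TP| + |FN| + |FP_sbm|)."""
    return tp / (tp + fn + fp_sbm)


# Hypothetical review outcomes for four objects (illustrative only).
labels = Counter(
    classify(*outcome)
    for outcome in [
        (True, True, True),    # matched and judged correct
        (True, False, True),   # wrongly matched, a correct match exists
        (True, False, False),  # wrongly matched, nothing to match
        (False, False, True),  # missed although a correct match exists
    ]
)
all_fp = labels["FP"] + labels["FP_sbm"]
print(precision(labels["TP"], all_fp))
print(recall(labels["TP"], labels["FN"], labels["FP_sbm"]))

# Nottingham counts from Table 11.5: 177 / (177 + 19).
print(round(precision(177, 19), 2))
```

The FPsbm counts are not listed in Table 11.5, so the reported recall values cannot be recomputed from the table alone; the precision for Nottingham can, and the last line reproduces it.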

Most OSM spatial objects in the ‘Incorrectly Matched’ category were incorrectly stated as being partOf some other spatial objects nearby. It is difficult to prevent such mistakes because spatial objects and their parts may not have any similar lexical information and therefore partOf matches are generated mostly based on geometry matching. Though the generated matches will be verified using reasoning in spatial logic and description logic, not all mistakes can be detected. For example, wrong partOf matches will not be detected by spatial logic, if spatial objects involved in them are all close to each other. Some wrong partOf matches cannot be detected by description logic, because several OSM spatial objects do not have any type information and the use of description logic for verifying consistency of partOf matches is limited by a small set of manually generated ‘partOf-disjointness’ statements.

FIGURE 11.2: OSM spatial objects of the Nottingham case (left) and the Southampton case (right) are classified into four categories: TP (Black), FP (Red), TN (Yellow) and FN (Green).

MatchMaps failed to match OSM spatial objects in the ‘Incorrectly Not-matched’ category, mainly because its lexical matching method cannot match different names (represented by non-similar strings) of the same spatial object. For example, the OSGB spatial object labelled as ‘Nottinghamshire Constabulary, Police Services’ and the OSM spatial object labelled as ‘Central Police Station’ cannot be matched but they actually represent the same object in the real world.

We compare the performance of MatchMaps with two ontology matching (instance matching) systems, LogMap [Jiménez-Ruiz and Grau, 2011] and KnoFuss [Nikolov et al., 2007a], for matching the study area in Nottingham city centre. We did not compare MatchMaps to geometry matching systems because MatchMaps uses standard geometry matching techniques (see Chapter 5). More advanced geometry matching methods may work better than MatchMaps for matching geometries, but for matching spatial features with meaningful labels, they do not make effective use of lexical information and do not verify consistency of matches using spatial logic as MatchMaps does. We only compare the generated sameAs matches, as LogMap and KnoFuss do not generate any partOf matches, but the evaluation of MatchMaps using the whole ground truth (containing both sameAs matches and partOf matches) has also been provided in Table 11.5. In the ground truth established above, 137 sameAs matches should be generated. As shown in Table 11.6, the precision and recall of MatchMaps are much higher than those of LogMap and KnoFuss. This is mainly because LogMap and KnoFuss do not make effective use of location information.

TABLE 11.6: Comparing sameAs matches generated by MatchMaps, LogMap and KnoFuss (Nottingham case)

Measure                      MatchMaps   LogMap   KnoFuss
number of sameAs matches     115         119      102
number of correct matches    115         28       18
precision                    1           0.24     0.18
recall                       0.84        0.20     0.13
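The figures in Table 11.6 can be cross-checked with a short calculation. The sketch below assumes that precision is the number of correct matches divided by the number of generated sameAs matches, and that recall is the number of correct matches divided by the 137 sameAs matches in the ground truth; the text does not spell these formulas out for this table, so they are an assumption, and the names used here are hypothetical.

```python
# Number of sameAs matches expected by the ground truth (from the text).
GROUND_TRUTH_SAMEAS = 137

# (generated sameAs matches, correct matches) per system, from Table 11.6.
SYSTEMS = {
    "MatchMaps": (115, 115),
    "LogMap": (119, 28),
    "KnoFuss": (102, 18),
}


def precision_recall(generated: int, correct: int, expected: int) -> tuple[float, float]:
    """Return (precision, recall) under the assumed definitions.

    precision = correct / generated
    recall    = correct / expected
    """
    return correct / generated, correct / expected


for name, (generated, correct) in SYSTEMS.items():
    p, r = precision_recall(generated, correct, GROUND_TRUTH_SAMEAS)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Under these assumptions the rounded results agree with the precision and recall rows of Table 11.6.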

In document Matching disparate geospatial datasets and validating matches using spatial logic (Page 172-176)