Database Examples - Same Symbols, But Different Meanings

1.5. Same Symbols, But Different Meanings

1.5.2. Database Examples

Consider two very simple relational databases, where the only relation that is modeled is the spatial relation in, the only entities under consideration are Route 2, Orono, Bangor, and Maine, and there are no attributes for any of these entities. We assume the databases have the

following information in common:

1. the ‘real-world’ domain, whose entities are Route 2, Orono, Bangor, and Maine;

2. the data domain:{Route2, Orono, Bangor, andM aine};

3. the data model — in this case, the relational data model;

4. the database schema, which consists of a single binary relation ‘in’, without attributes

(i.e., there is no special name given to the set of elements that could be in the first or second position of the tuples for in); and

5. the type of the data (character data).

Two different people might plausibly create the databases DB1 and DB2 (Figure 1.5),

according to their understandings of what the spatial relation in means. The reader should

not consider these databases to be simplified versions of Geographic Information Systems (GISs) or spatial databases (Rigaux et al., 2002), which specify or derive spatial relations via coordinate systems or via topological relations of spatial parts.

The examples in this section serve to illustrate the central ways in which an ontology differs from a conventional database: (1) the ontology specifies axioms in addition to data; (2) the semantics of an ontology typically commits to more than one fixed ‘way the world could

be’; and (3) the analysis of the semantics of an ontology is inextricably linked to the idea of ‘more than one way that the world could be.’

ConsiderDB1 (Figure 1.5). The person who createdDB1 has a notion of in that does not

include either the relationship “Route 2 is in Orono” or that “Route 2 is in Bangor,” although this person’s notion of in does include the relationships “Orono is in Maine,” “Bangor is in

Maine,” and “Route 2 is in Maine.” (Such an argument glosses over considerations of the open- world and closed-world assumptions (Reiter, 1978), which are discussed further in Chapters 2 and 3.)

DB1: Tuples forin(x, y) DB2: Tuples forin(x, y)

(Route2, Orono) (Route2, Bangor) (Orono, M aine) (Orono, M aine) (Bangor, M aine) (Bangor, M aine) (Route2, M aine) (Route2, M aine)

Figure 1.5: Two databases that model in differently

The difference in the two persons’ notions of in is thus reflected, albeit imperfectly, by the different sets of tuples used to populate their respective databases. More specifically, the

elements in the domain of the database, plus the data tuples in the in relation constitute a model (Manzano, 1999; Vianu, 1997) of the part of the real world under consideration. Once people create these tuples in the database, they effectively turn over any treatment of the

meaning of in to the system that works with these tuples. Consequently, from the perspective of the database management system, the two different meanings of in are determined solely by the data elements and the tuples themselves. Whether or not the resulting databases can

Regardless of whether the two people who createdDB1andDB2have accurately captured

their intended meanings of in, so far as the two databases are concerned, the meaning of in is

determined solely by the set of tuples in their respective relational tables for in, i.e., by their models of in. In this example, the tuples themselves, along with relational symbol in, comprise the only machine-processable information available for processing the meaning of in. There

may be myriad other senses of ‘in’ that are not captured by this specification, but the point here is that the only thing a computer has to work with is the specification it is given, along with the rules for processing it.

Saying that two databases mean different things by the same relational symbol amounts to saying that the databases have different sets of tuples in their respective tables for this

relational symbol. Schematically, this situation is depicted in Figure 1.6.

same relational symbol: in my idea

of in

single set of tuples

DB 1

same external reality

my idea of in different notions of in different models of in DB 2 single set of tuples

The following assumptions underlie the framework of the above analysis of differences betweenDB1 andDB2.

• The two people are modeling the “same” external reality.

• The two people have their own individual notions of what the spatial relation in means.

• They use the same symbol—in—to represent this spatial relation in their databasesDB1

andDB2.

• The set of tuples in one database may or may not overlap with the set of tuples from the

other database. In the example of Figure 1.5,DB1 andDB2have different, overlapping

sets of tuples, which reflect the different notions their creators have of the spatial relation in.

• When one considers the differences in the specifications of the databases, one is no longer dealing with the notions of in that the two human modelers have. Rather, the only information that is available about the meaning of in is that which can be gleaned

solely from the databases themselves. When the focus is on the databases themselves and not on what the human modelers may have intended, the set of tuples from the in table becomes the only tangible artifact available to determine what in means. That

is, once the human modeler is out of the picture, the set of data tuples of a database essentially is the meaning of the relation in.

• The set of tuples inDB1 orDB2 is the model of the world that each person has created,

whether or not this model faithfully depicts all the nuances of the relation in that the database creator might have.

DB1 and DB2 are semantically heterogeneous because even though they use the same

symbol to describe the relation in, they mean different things by it. Given that DB1 and DB2 are semantically heterogeneous, to what extent might these databases nevertheless be

semantically interoperable? There are many plausible ways to answer this question, among which are that

• their sets of tuples overlap;

• they give the same answers to the same questions;

• queries to the databases do not involve Route 2; and

• queries to the databases do involve Maine.

What these plausible answers show is that in assessing to what extent two such databases might be semantically interoperable, it makes sense to consider both the sets of tuples in the databases and the queries put to the databases.

To determine to what extent two ontologies (as opposed to two databases) are semantically interoperable, one also needs to consider the axioms of the ontologies and the models that re-

sult from including axioms along with data. The next section introduces these considerations. A fuller discussion of some needed technical background is given in Chapter 2.

In document Semantic Interoperability of Geospatial Ontologies: A Model-theoretic Analysis (Page 30-34)