1.4 Structure of this Dissertation
2.1.1 Interaction in Content-Based Pub-Sub Systems
One might regard content-based pub-sub systems as similar to database management systems. Although one can argue for this perspective, another viewpoint describes content-based pub-sub systems as the opposite of database management systems. We do not take sides here because both positions have merit, as the following analysis shows.
Interaction Patterns in Content-Based Pub-Sub Systems
In content-based pub-sub systems, one can find four different interaction patterns. We introduce them in the following paragraphs. Afterwards, we link these concepts to database management systems.
Registering and Deregistering Subscriptions. Similarly to database management systems, content-based pub-sub systems allow their users to define queries. These queries are referred to as subscriptions and need to be registered with the pub-sub system before their evaluation. Subscriptions are defined with the help of a subscription definition language (also referred to as subscription language).
The set of all subscriptions is denoted by S, a particular subscription set by S_i (S_i ⊆ S), and an individual subscription of this set by s ∈ S_i. Users who register such subscriptions are referred to as subscribers, individually denoted by S. The subscriber of a subscription s is abbreviated by S(s).
Subscriptions are valid until they are deregistered. In the most general definition, a subscription describes a Boolean filter expression (or simply filter expression) on event messages. The variables of this expression are called predicates and represent simple attribute filters. The concept of event messages is introduced in the following paragraph.
Publishing Event Messages. The incoming information in a pub-sub system is provided by publishers in the form of event messages (or simply messages or events within this dissertation). An individual publisher is denoted by P; an event message is abbreviated by e. The publisher of a particular event message e is referred to as P(e).
Generally, event messages are represented by attribute-value pairs in content-based pub-sub systems.
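To make the notions of event message, predicate, and filter expression concrete, the following minimal Python sketch models an event message e as a dictionary of attribute-value pairs and a subscription s as a Boolean combination of predicates. The attribute names (`symbol`, `price`), the helper names (`pred`, `all_of`, `any_of`), and the rule that a predicate on a missing attribute evaluates to false are assumptions made for illustration; they do not reflect a particular subscription language.

```python
import operator

# An event message e: a set of attribute-value pairs (hypothetical attributes).
event = {"symbol": "ACME", "price": 42.5}


def pred(attribute, op, value):
    """Predicate: a simple attribute filter, e.g. price > 40."""
    def evaluate(e):
        # Assumption: a predicate on a missing attribute evaluates to false.
        return attribute in e and op(e[attribute], value)
    return evaluate


def all_of(*filters):
    """Conjunction of filter expressions."""
    return lambda e: all(f(e) for f in filters)


def any_of(*filters):
    """Disjunction of filter expressions."""
    return lambda e: any(f(e) for f in filters)


# Subscription s: symbol = "ACME" AND (price > 40 OR price < 10)
s = all_of(
    pred("symbol", operator.eq, "ACME"),
    any_of(pred("price", operator.gt, 40), pred("price", operator.lt, 10)),
)

matches = s(event)  # evaluate the filter expression of s on e
```

Here the predicates are the variables of the Boolean expression: each one evaluates to true or false on a given event message, and the subscription combines these truth values.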
Registering and Deregistering Advertisements. Publishers in pub-sub systems have to specify their future event messages and register these specifications with the system before sending messages. The term advertisements is widely used for the specifications of publishers. They are defined using an advertisement definition language (or just advertisement language).
The set of all advertisements is denoted by A, a particular set of advertisements by A_i, and an individual advertisement of this set by a ∈ A_i. Once registered, advertisements need to be deregistered to become invalid.
In the most general definition, an advertisement (similarly to a subscription) describes a Boolean filter expression on event messages. An event message e conforms to an advertisement a if the filter expression of a evaluates to true on e. All event messages sent by a publisher need to conform to one of its registered advertisements. The set of event messages conforming to an advertisement a is denoted by E(a).
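The conformance requirement can be sketched as a check that a system might perform before accepting an event message. The following Python fragment is a minimal illustration under stated assumptions: the class `AdvertisementBase`, its method names, the publisher identifier `"P1"`, and the attributes `type` and `price` are hypothetical and chosen only for this example.

```python
class AdvertisementBase:
    """Stores registered advertisements per publisher (hypothetical helper)."""

    def __init__(self):
        self._ads = {}  # publisher id -> list of filter expressions

    def register(self, publisher, advertisement):
        self._ads.setdefault(publisher, []).append(advertisement)

    def deregister(self, publisher, advertisement):
        # Advertisements stay valid until they are explicitly deregistered.
        self._ads[publisher].remove(advertisement)

    def conforms(self, publisher, event):
        """True if the event conforms to at least one registered advertisement."""
        return any(a(event) for a in self._ads.get(publisher, []))


# Advertisement a: the publisher announces quotes with positive prices.
def a(e):
    return e.get("type") == "quote" and e.get("price", 0) > 0


ads = AdvertisementBase()
ads.register("P1", a)

ok = ads.conforms("P1", {"type": "quote", "price": 12.0})
rejected = not ads.conforms("P1", {"type": "news", "headline": "..."})
```

In this reading, E(a) is the set of all event messages for which `a` returns true, and a publisher's message is acceptable if it lies in E(a) for at least one of that publisher's registered advertisements.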
Sending Notifications. The answers to registered subscriptions are provided by the content-based pub-sub system based on the content of the incoming event messages. These answers are called notifications and are sent by the system to the respective subscribers. We abbreviate a particular notification by n.
The process of identifying all relevant subscriptions for an incoming message (i.e., those subscriptions whose filter expressions evaluate to true for this message) is generally referred to as filtering or event filtering. For a particular subscription s, the set of all relevant event messages is denoted by E(s).
When considering a pub-sub system as a black box, this filtering, in combination with the delivery of notifications, is the main task of a pub-sub system: subscribers use the system in order to receive notifications according to their subscriptions. Publishers use the system in order to have their messages delivered to all interested subscribers.
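Viewed as a black box, the filtering and notification task can be sketched in a few lines of Python. The sketch below is deliberately naive: it tests every registered subscription in turn, which is the simplest possible approach and not an efficient filtering algorithm. The class name `Broker`, the subscriber names, and the representation of a notification as a (subscriber, event) pair are assumptions made for illustration.

```python
class Broker:
    """Naive content-based pub-sub system (illustrative only)."""

    def __init__(self):
        self._subscriptions = {}  # subscription id -> (subscriber, filter)
        self._next_id = 0

    def register(self, subscriber, filter_expr):
        sub_id = self._next_id
        self._next_id += 1
        self._subscriptions[sub_id] = (subscriber, filter_expr)
        return sub_id

    def deregister(self, sub_id):
        self._subscriptions.pop(sub_id, None)

    def publish(self, event):
        """Filter the event against all subscriptions; return notifications."""
        return [
            (subscriber, event)
            for subscriber, filter_expr in self._subscriptions.values()
            if filter_expr(event)
        ]


broker = Broker()
cheap = broker.register("alice", lambda e: "price" in e and e["price"] < 10)
broker.register("bob", lambda e: e.get("symbol") == "ACME")

# Both subscriptions evaluate to true, so both subscribers are notified.
notifications = broker.publish({"symbol": "ACME", "price": 5})
```

A subscription stays in the system across many calls to `publish` until `deregister` is called, which mirrors the statement that subscriptions are valid until they are deregistered.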
[Figure: publishers and subscribers on either side of the content-based publish/subscribe system; advertisements and subscriptions form the configurational interaction, event messages and notifications the operational interaction.]
Figure 2.1: Overview of the interaction in pub-sub systems.
These interaction patterns provide the means for users to communicate with the help of a pub-sub system. Clearly, the same user might simultaneously act as both subscriber and publisher. That is, the same user might be involved in all interaction patterns, and thus register and deregister subscriptions and advertisements, send event messages, and receive notifications.
We give an overview of the two kinds of users of a content-based pub-sub system and their potential interaction with this system in Figure 2.1. We split the different interaction patterns into two types: the configurational interaction, containing the registration of subscriptions and advertisements, and the operational interaction, including the publication of messages and the notification about messages.
The content-based pub-sub system is situated between publishers and subscribers and decouples [EFGK03] the communication between these two parties. In the literature, pub-sub systems as decoupling components have thus been described in various ways, for example, as mediator [BBC+04, EFGK03, LJ03] and broker [BBC+04, HGM01, Leh05].
Correspondence in Database Management Systems
Having introduced the concepts of pub-sub systems, we now relate them to the widely known notions of database management systems. We do so based on two different viewpoints, the interaction semantics view and the data storage view [BH07].
Interaction Semantics View. Relating the concepts to an interaction semantics view, subscriptions represent database queries and event messages correspond to data stored within the database. These correspondences stem from the fact that (i) subscriptions and queries denote user requests that are answered by the respective system, and (ii) event messages and stored data are the basis to provide these answers. This concept of answers, in turn, clearly corresponds to notifications in pub-sub systems and the results to queries in database management systems.
[Figure: publish/subscribe system with event messages, subscriptions, and advertisements as input to the system and notifications as output to users; database management system with updates/insertions/deletions, queries, and schema/access privileges as input to the system and results as output to users.]
Figure 2.2: Corresponding concepts between pub-sub systems and database management systems when taking an interaction semantics view (corresponding concepts are illustrated at the same position for both kinds of systems).
For advertisements, however, one cannot clearly identify a counterpart in database management systems. Although we cannot find this exact equivalence, the database schema in combination with access privileges to particular tables or table columns can be seen as partially corresponding to advertisements. The advertisements in content-based pub-sub systems, though, are a more general concept. They not only describe the manipulation of a particular type of data (published event messages) but also how the data will be manipulated (the content that will be sent in the future).
We give an overview of these corresponding concepts in content-based pub-sub and database management systems in Figure 2.2. The related notions are arranged in the same positions for both systems to allow for a better overview. However, there are deep differences between content-based pub-sub systems and database management systems that result from their opposite problem definitions and the implied need to handle data differently. As a consequence, the following observations hold when considering data storage in these systems.
Data Storage View. Subscriptions are long-standing queries that are stored and continuously evaluated by a pub-sub system until they are finally deregistered and removed from the system. These subscriptions therefore comprise what we call the subscription base. Because this subscription base needs to be stored within the system, it is the counterpart to the data that is stored in database management systems, the data base. Hence, both concepts form a component that is known to the respective system in advance.
[Figure: publish/subscribe system with the subscription/advertisement base (subscriptions, advertisements) as stored data and event messages and notifications as transient data; database management system with the data base (updates/insertions/deletions, schema/access privileges) as stored data and queries and results as transient data.]
Figure 2.3: Corresponding concepts between pub-sub systems and database management systems when considering the data storage (corresponding concepts are illustrated at the same position for both kinds of systems).
Database queries, however, are not known in advance. They are sent to the database management system once and are subsequently executed; finally, the system returns the results to the issuer of the query. The same holds for the individual event messages from publishers. At the (highly frequent) occurrence of event messages, the pub-sub system needs to find all relevant subscriptions in its subscription base. Thus, database queries and event messages are corresponding concepts in these systems.
The correspondence of the remaining concepts aligns with the findings when taking the interaction semantics view. Notifications correspond to query results, and advertisements partially match the database schema and access privileges. Like the schema and access privileges, advertisements need to be stored by the system. We refer to stored advertisements as the advertisement base in the following.
We illustrate these corresponding concepts in Figure 2.3. Again, the related notions are arranged in the same positions for both systems. From this figure, one can clearly identify the opposite problem definitions of content-based pub-sub and database management systems. One system (the database management system) answers queries based on stored data; for the other system (the pub-sub system), incoming data (event messages) leads to answers (notifications) to stored queries (subscriptions).
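The opposite problem definitions can be made tangible with a small Python sketch that places both kinds of systems side by side. In the first half, the data is stored and a query arrives once; in the second half, the subscriptions are stored and an event message arrives once. All names (`stored_data`, `run_query`, `subscription_base`, `publish`) and the example attributes are assumptions for illustration, and both functions use a plain linear scan rather than any realistic evaluation strategy.

```python
# Database management system: data is stored, queries are transient.
stored_data = [
    {"symbol": "ACME", "price": 5},
    {"symbol": "INIT", "price": 50},
]


def run_query(query):
    """A query arrives once and is evaluated against all stored data."""
    return [row for row in stored_data if query(row)]


# Pub-sub system: subscriptions are stored, event messages are transient.
subscription_base = {
    "s1": lambda e: e["price"] < 10,
    "s2": lambda e: e["symbol"] == "INIT",
}


def publish(event):
    """An event arrives once and is evaluated against all stored subscriptions."""
    return [sid for sid, s in subscription_base.items() if s(event)]


results = run_query(lambda row: row["price"] < 10)   # stored rows answering the query
notified = publish({"symbol": "ACME", "price": 5})   # stored subscriptions to notify
```

The two functions have the same shape but swap the roles of the stored and the transient side, which is the correspondence shown by the data storage view.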
management systems strongly influences the internal handling of stored data, that is, subscriptions and advertisements. We further elaborate on these effects and their implications for the design of content-based pub-sub systems in Section 2.6.