
This chapter has introduced the Web query language Xcerpt and the reactive Web language XChange. These two languages are the context in which XChangeEQ is being developed: XChangeEQ, the topic of this thesis, employs Xcerpt to query XML data in simple events and it can be used inside XChange as a sublanguage for specifying complex events. All three languages, Xcerpt, XChange, and XChangeEQ, employ a pattern-based approach for dealing with Web data and together give a suite of languages for realizing common tasks on the Web involving querying and reasoning with regular Web data, reacting to and communicating events, updating Web data, and detecting and reasoning with complex events.

Part II

XChangeEQ: An Expressive High-Level Event Query Language

Chapter 5

Language Design

The language design of XChangeEQ follows a clear rationale based on a few core principles. At the very heart of XChangeEQ is the idea that event queries can be described according to four dimensions: data extraction, event composition, temporal and other relationships between events, and event accumulation (Section 5.1), and that an event query language must separate these four dimensions in order to achieve full expressivity (Section 5.2). With the emergence of the Web as a universal information system, XChangeEQ caters for the specific needs of querying and reasoning with events on the Web (Sections 5.3 and 5.4). XChangeEQ aims at being a declarative, easy-to-use language (Section 5.5) that comes with clear semantics (Section 5.6). Finally, acknowledging that a single language often cannot solve all problems, XChangeEQ is designed with extensibility in mind (Section 5.7).

The language XChangeEQ was first presented in [BE06a]. Issues related to its language design (and the language design of complex event query languages in general) are also discussed in [BE06b, BE07d, BE07a, BE08b].

5.1 Four Dimensions of Querying Events

Characteristic for applications involving event queries is the need to (1) utilize data contained in the events, (2) detect patterns composed of multiple events (i.e., complex events), (3) reason about temporal and other relationships between events, and (4) accumulate events for negation and aggregation. We can understand these requirements as four complementary dimensions that we call data extraction, event composition, temporal (and other) relationships, and event accumulation. These four dimensions, which will be detailed shortly, must (at least) be considered for querying complex events. How well an event query language covers each of the dimensions gives a practical measure for its expressiveness.

Data extraction. Events contain data that is relevant for applications to decide whether and how to react to them. The data of events must be extracted and provided (typically as bindings for variables) to test conditions (e.g., arithmetic expressions) inside the query, combine event data with persistent, non-event data (e.g., from a database), construct new events (e.g., by deductive rules, see Section 5.4), or trigger reactions (e.g., database updates).

Often, events are transmitted as messages in XML formats (cf. Chapter 2.4); examples of such message formats include SOAP [G+03], Common Base Event (CBE) [IBM04], and the Facility Control Markup Language (FCML) [BLO+08]. Data in such XML messages can be semi-structured, i.e., have a quite complex and varying structure. This gives a strong motivation to build upon and embed an already existing XML query language into an event query language. Accordingly, XChangeEQ builds upon the XML query language Xcerpt; the advantages of using Xcerpt over the standard XML query languages XQuery and XSLT have been outlined in Chapter 4.1.1.
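To make the idea of data extraction concrete, the following is a minimal sketch in Python rather than in XChangeEQ or Xcerpt. It assumes a hypothetical order event; the element names (order, id, customer, total) and the function extract_bindings are invented for illustration and do not come from any of the message formats cited above.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML event message; the element names are illustrative only.
EVENT_XML = """
<order>
  <id>4711</id>
  <customer>alice</customer>
  <total>129.50</total>
</order>
"""

def extract_bindings(xml_text):
    """Return variable bindings for an order event, or None if it does not match."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "order":
        return None  # not the kind of event this query is interested in
    order_id = root.findtext("id")
    customer = root.findtext("customer")
    total = root.findtext("total")
    if order_id is None or customer is None or total is None:
        return None  # required data is missing, so the pattern does not match
    return {"id": order_id, "customer": customer, "total": float(total)}

bindings = extract_bindings(EVENT_XML)

# The bindings can then be used to test a condition inside the query.
is_large_order = bindings is not None and bindings["total"] > 100
```

The dictionary plays the role of a set of variable bindings: once extracted, the same values could be compared with non-event data, used to construct a new event, or passed on to a reaction.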

Event composition. To support complex events, i.e., events that consist of several events, event queries must support composition constructs such as the conjunction and disjunction of events (more precisely, of event queries). Composition must be sensitive to event data, which is often used to correlate and filter events (e.g., consider only stock transactions from the same customer for composition). Event composition also gives rise to relative temporal events, that is, timer events that are defined relative to another event such as “2 hours after event X.” Since reactions to events are usually sensitive to timing and order, an important question for complex events is when they are detected. In a well-designed language, it should be possible to recognize without difficulty when reactions to a given event query are triggered.
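A data-sensitive conjunction can be sketched in a few lines of Python. This is only an illustration, not XChangeEQ syntax or semantics: the Event class, the event kinds "buy" and "sell", and the choice to report the later of the two occurrence times as the detection time are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str        # hypothetical event type, e.g. "buy" or "sell"
    customer: str    # event data used for correlation
    time: float      # occurrence time in seconds (illustrative)

def conjunction(events, kind_a, kind_b):
    """Pair each kind_a event with each kind_b event of the same customer.

    Returns tuples (a, b, detection_time). The detection time is assumed
    here to be the time of the later of the two events.
    """
    results = []
    for a in events:
        if a.kind != kind_a:
            continue
        for b in events:
            # Correlate on event data: only events of the same customer combine.
            if b.kind == kind_b and b.customer == a.customer:
                results.append((a, b, max(a.time, b.time)))
    return results

events = [
    Event("buy", "alice", 10.0),
    Event("sell", "bob", 20.0),
    Event("sell", "alice", 30.0),
]
matches = conjunction(events, "buy", "sell")
```

In this run, only the two events of customer "alice" are combined; the "sell" event of "bob" is filtered out by the correlation on event data.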

Temporal (and other) relationships. Time plays an important role in event-driven applications. Event queries must be able to express temporal conditions such as “events A and B happen within 1 hour, and A happens before B.” Temporal relationships between events can be qualitative or quantitative. Qualitative relationships concern only the temporal order of events (e.g., “shipping after order”). Quantitative (or metric) relationships concern the actual time elapsed between events (e.g., “shipping and order more than 24 hours apart”).
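The difference between qualitative and quantitative relationships can be illustrated with a small Python sketch. As a simplifying assumption, each event is reduced to a single occurrence time; the function names before, within, and a_before_b_within_hour are hypothetical and are not XChangeEQ constructs.

```python
from datetime import datetime, timedelta

def before(a, b):
    """Qualitative relationship: a occurs strictly before b."""
    return a < b

def within(a, b, span):
    """Quantitative relationship: a and b are at most `span` apart."""
    return abs(b - a) <= span

def a_before_b_within_hour(a, b):
    """'Events A and B happen within 1 hour, and A happens before B.'"""
    return before(a, b) and within(a, b, timedelta(hours=1))

# Illustrative occurrence times.
order_time = datetime(2008, 1, 1, 10, 0)
shipping_time = order_time + timedelta(minutes=45)

ok = a_before_b_within_hour(order_time, shipping_time)
```

Keeping the two checks as separate predicates mirrors the distinction drawn above: the first depends only on the order of the events, the second on the elapsed time between them.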

Time takes a dominating role in event processing, since it affects the timing and order of reactions to complex events. Therefore, we concentrate on temporal relationships in XChangeEQ. However, there might also be other relationships between events that are of interest in event queries. For each such relationship, the question arises whether it should simply be treated as a relationship between event data or deserves special treatment. XChangeEQ’s language design is kept extensible so that special treatment of other relationships can be easily added in the same manner as temporal relationships.

For some event processing applications, it is interesting to look at causal relationships, e.g., to express queries such as “events A and B happen, and A has caused B.” While temporality and causality can be treated similarly in query syntax, causality raises interesting questions about how causal relationships can be defined and maintained. This issue will be discussed in Chapter 19.

In event processing applications involving geographically distributed event sources, spatial relationships between events can also be of interest. For example, a query might specify that two events occur within 100 meters of each other. Spatial information about events might, as noted above, be simply considered as ordinary data in events (more so than causality, because there are fewer issues regarding its definition and maintenance). However, special treatment of spatial relationships might be of interest, e.g., in systems that involve mobile event sources and require spatio-temporal reasoning [Sch08]. Further, spatial relationships might play a role in distributed query evaluation. These issues will also be discussed in Chapter 19.

Event accumulation. Event queries must be able to accumulate events to support non-monotonic query features such as negation of events (understood as their absence) or aggregation of data from multiple events over time. The reason for this is that the event stream is, in contrast to extensional data in a database, unbounded (or “infinite”); one therefore has to define a scope, e.g., a time interval, over which events are accumulated when aggregating data or querying the absence of events. Application examples where event accumulation is required are manifold. A business activity monitoring application might watch out for situations where “a customer’s order has not been fulfilled within 2 days” (negation). A stock market application might require notification if “the average of the reported stock prices over the last hour rises by 5%” (aggregation).
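The role of the accumulation scope can be sketched in Python for both examples. This is an illustration under stated assumptions, not XChangeEQ: events are represented as plain dictionaries and (time, price) pairs, and the function names unfulfilled_orders and average_price are hypothetical.

```python
from datetime import datetime, timedelta

def unfulfilled_orders(orders, shipments, now, deadline=timedelta(days=2)):
    """Negation: orders with no shipment inside their deadline window.

    orders and shipments map an order id to an occurrence time. Absence
    can only be decided once the window of an order has closed.
    """
    overdue = []
    for order_id, ordered_at in orders.items():
        window_end = ordered_at + deadline
        if now < window_end:
            continue  # window still open: a shipment may yet arrive
        shipped_at = shipments.get(order_id)
        if shipped_at is None or shipped_at > window_end:
            overdue.append(order_id)
    return overdue

def average_price(prices, now, window=timedelta(hours=1)):
    """Aggregation: mean of the prices reported in the window (now - window, now]."""
    in_window = [price for time, price in prices if now - window < time <= now]
    if not in_window:
        return None  # no events accumulated in this scope
    return sum(in_window) / len(in_window)

t0 = datetime(2008, 1, 1, 12, 0)
orders = {"o1": t0, "o2": t0}
shipments = {"o2": t0 + timedelta(days=1)}
overdue = unfulfilled_orders(orders, shipments, now=t0 + timedelta(days=3))

prices = [(t0, 10.0), (t0 + timedelta(minutes=30), 20.0), (t0 + timedelta(minutes=90), 30.0)]
recent_average = average_price(prices, now=t0 + timedelta(minutes=45))
```

In both functions the time window is what makes the query answerable on an unbounded stream: without it, the absence of a shipment could never be confirmed and the average would never be final.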