Not all events in our system were provided by the scenarios described in Section 2.7. To enable more diverse use cases and to alleviate a cold start of our system we add further real-time data sources. To that end we implemented several input adapters for well-known streaming data sources on the Web to be used by the scenarios.
Adding these existing sources of push-data addresses our Requirement R6: Push-data on the Web. We provide adapters for events from the Social Web as well as the Internet of Things (IoT) to demonstrate the diversity in existing data and the applicability of RDF to both fields.
Social Web sites such as Facebook and Twitter host a large amount of user-contributed material about a wide variety of events happening in the real world. Events from Xively12 further extend this range by adding real-time data from devices around the world that people share.
Specifically, the data sources are: (1) a Facebook app that a user can allow to publish all of their Facebook Wall updates as RDF events; (2) a Xively adapter that subscribes to sensor readings and similar events from the IoT, which can then be flexibly transformed into RDF events; (3) a Twitter adapter that uses the Twitter API to receive tweets and convert them to RDF events.
12Xively, a Web portal to connect sensor data: http://xively.com/ previously known as
The Facebook adapter. It consists of three modules: the first subscribes to and retrieves information from Facebook, the second transforms this information into RDF events, and the third delivers the events using WS-Notification publish/subscribe.
A Tomcat servlet is created for retrieving information and creating events. This application registers with Facebook to receive events in real-time. Whenever authenticated Facebook users post something on their Facebook Wall, a Facebook real-time notification is sent to our servlet using WebHooks. A WebHook is an HTTP callback: an HTTP POST request that occurs when something happens. The servlet then fetches the necessary data which is not part of the Facebook notification, such as the user’s location and the full message content. After that, the data is transformed into RDF events. Those events are sent to the service bus of our system for use in the platform.
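The callback flow described above can be sketched as follows. This is a minimal illustration, not the adapter's actual servlet: it uses the JDK's built-in `com.sun.net.httpserver` server instead of Tomcat so that it stays self-contained, and the class name `WebHookReceiver`, the `/webhook` path, the `fetchDetails` lookup and the in-memory `published` list (standing in for the service bus) are all hypothetical.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical WebHook receiver; stands in for the Tomcat servlet described in the text. */
class WebHookReceiver {
    // Events that would be sent to the service bus (kept in memory for this sketch).
    private final List<String> published = new ArrayList<>();
    // Assumed lookup for data missing from the notification (e.g. the full message content).
    private final Function<String, String> fetchDetails;

    WebHookReceiver(Function<String, String> fetchDetails) {
        this.fetchDetails = fetchDetails;
    }

    List<String> published() {
        return published;
    }

    /** Core callback logic: validate, enrich, build an event, publish. Returns the HTTP status. */
    int onNotification(String method, String body) {
        if (!"POST".equals(method)) return 405;          // a WebHook is an HTTP POST
        if (body == null || body.isBlank()) return 400;  // nothing to process
        String details = fetchDetails.apply(body);       // fetch what the notification lacks
        published.add("event[" + details + "]");         // placeholder for the RDF conversion
        return 200;
    }

    /** Exposes the callback over HTTP; port and path are assumptions. */
    HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/webhook", (HttpExchange ex) -> {
            String body = new String(ex.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
            int status = onNotification(ex.getRequestMethod(), body);
            ex.sendResponseHeaders(status, -1);          // -1 means no response body
            ex.close();
        });
        server.start();
        return server;
    }
}
```

Keeping the callback logic in `onNotification`, separate from the HTTP wiring, is a design choice of this sketch: the enrichment and publishing steps can then be exercised without opening a network port.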
Listing 5.1 on page 50 shows an example event from the Facebook adapter. The listing demonstrates the use of all attributes currently in the schema. Some attributes are in the default namespace of our system (e.g. :status), some are in the namespace user: [Weaver and Tarjan 2012] defined by the Facebook Graph API (e.g. user:id).
The Xively adapter. It subscribes to sensor readings and similar events from the IoT. Using the adapter, such events can be transformed into RDF events.
To connect Xively to our system we implement another Java servlet. It is exposed to the Web so that Xively can invoke it via WebHooks whenever there is new data. When the servlet is invoked, it parses the data received from Xively, converts it to RDF and creates an event instance using an event class from our SDK (cf. Section 5.4) specific to Xively events. The data from Xively arrives as non-semantic JSON. We lift this structured JSON to meaningful RDF in two consecutive steps according to [Norton and Krummenacher 2010]. First, JSON is converted to “naive” RDF by replicating only the structure, not the semantics. Then, SPARQL CONSTRUCT queries are used as an RDF-to-RDF transformation. Meaningful RDF properties from well-known schemas can thereby be introduced into the event, replacing the merely structural ones. This increases interoperability between event producers and consumers and makes the results more usable as semantic events.
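The two lifting steps can be illustrated with a small sketch. It makes several assumptions: the JSON is taken to be already parsed into a flat key/value map, the namespace `http://example.org/json#` for the "naive" properties is hypothetical, and the second step is emulated with a plain lookup table rather than an actual SPARQL CONSTRUCT query, to avoid depending on an RDF library. The CONSTRUCT query in the comment is illustrative only and is not taken from the adapter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch of two-step lifting; all IRIs and property names here are hypothetical. */
class XivelyLifting {
    // Assumed namespace for purely structural ("naive") properties.
    static final String NAIVE_NS = "http://example.org/json#";

    /** Step 1: replicate the JSON structure as triples, one per key/value pair. */
    static List<String[]> toNaiveRdf(String subject, Map<String, String> json) {
        List<String[]> triples = new ArrayList<>();
        for (Map.Entry<String, String> e : json.entrySet()) {
            triples.add(new String[] {subject, NAIVE_NS + e.getKey(), e.getValue()});
        }
        return triples;
    }

    /**
     * Step 2: replace structural properties with properties from well-known schemas.
     * The text does this with SPARQL CONSTRUCT queries, for instance (illustrative):
     *   CONSTRUCT { ?s geo:lat ?v } WHERE { ?s json:lat ?v }
     * A lookup table from naive to target property emulates that transformation here.
     */
    static List<String[]> lift(List<String[]> naive, Map<String, String> mapping) {
        List<String[]> lifted = new ArrayList<>();
        for (String[] t : naive) {
            String target = mapping.get(t[1]);
            if (target != null) {
                // Triples without a mapping are dropped, just as a CONSTRUCT query
                // only emits what its template describes.
                lifted.add(new String[] {t[0], target, t[2]});
            }
        }
        return lifted;
    }
}
```

Separating the steps keeps the first one generic for any JSON source, while the schema-specific knowledge lives entirely in the second step's mapping.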
The Twitter adapter. It uses the Twitter API13 to receive tweets and convert them to RDF events. To connect to the Twitter API we have implemented another dedicated Java servlet. It makes heavy use of the Twitter4J14 library, a Java library for the Twitter API. The Twitter API “allows high-throughput near real-time access to various subsets of public and protected Twitter data”. Public tweets are available from all users and can be filtered in various ways: by user id, by keyword, by random sampling and/or by geographic location.
Listing 7.3 shows an example Twitter event displaying properties from our schema. As a best practice in ontology design we not only define our own schema but also re-use existing schemas to increase interoperability with other software and to improve the semantic understanding of our data. This addresses our Requirement R5: Ontology re-use. Thus, our schema uses event properties from the namespace sioc: in the SIOC ontology15 on line 21 to describe user-generated content on the Web 2.0. Moreover, we re-use properties from the W3C Basic Geo Vocabulary16 in the namespace geo: on lines 18 and 19 to describe the location in a standardised way.
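To make the re-use of existing vocabularies concrete, the following sketch serialises a tweet as Turtle using `sioc:content` from SIOC and `geo:lat` / `geo:long` from the W3C Basic Geo Vocabulary. It is not a reproduction of Listing 7.3: the class `TweetToTurtle`, the choice of `sioc:content` as the property, the plain (untyped) literals and the example event IRI are assumptions made for illustration.

```java
import java.util.Locale;

/** Hypothetical helper that renders a tweet as a Turtle snippet with re-used vocabularies. */
class TweetToTurtle {
    /** Escapes backslashes, double quotes and newlines for a Turtle string literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    /** Builds the Turtle text; eventIri is assumed to be a valid absolute IRI. */
    static String toTurtle(String eventIri, String text, double lat, double lon) {
        return String.join("\n",
            "@prefix sioc: <http://rdfs.org/sioc/ns#> .",
            "@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .",
            "",
            "<" + eventIri + ">",
            "    sioc:content \"" + escape(text) + "\" ;",
            // Locale.ROOT guarantees a decimal point regardless of the system locale.
            String.format(Locale.ROOT, "    geo:lat \"%.4f\" ;", lat),
            String.format(Locale.ROOT, "    geo:long \"%.4f\" .", lon));
    }
}
```

A call such as `TweetToTurtle.toTurtle("http://example.org/events/1", "Hello", 49.0069, 8.4037)` yields one subject with three properties, each drawn from an existing vocabulary rather than a newly defined one.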
13Twitter API: https://dev.twitter.com/
14Twitter4J, Java library for the Twitter API: http://twitter4j.org/
15SIOC Core Ontology Specification: http://sioc-project.org/ontology
16Basic Geo (WGS84 lat/long) Vocabulary: http://www.w3.org/2003/01/geo/