Scalable processing of aggregate functions for data streams in resource-constrained environments


BARCELONATECH

DOCTORAL THESIS

Scalable Processing of Aggregate Functions for Data Streams in Resource-Constrained Environments

Author: Álvaro VILLALBA

Supervisor: Dr. David CARRERA

A thesis submitted in fulfillment of the requirements for the degree of Doctor of Philosophy

in the


Declaration of Authorship

I, Álvaro VILLALBA, declare that this thesis titled, “Scalable Processing of Aggregate Functions for Data Streams in Resource-Constrained Environments” and the work presented in it are my own. I confirm that:

• This work was done wholly or mainly while in candidature for a research degree at this University.

• Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.

• Where I have consulted the published work of others, this is always clearly attributed.

• Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.

• I have acknowledged all main sources of help.

• Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Signed:


UNIVERSITAT POLITÈCNICA DE CATALUNYA·BARCELONATECH

Abstract

Facultat d’Informàtica de Barcelona Departament d’Arquitectura de Computadors

Doctor of Philosophy

Scalable Processing of Aggregate Functions for Data Streams in Resource-Constrained Environments


The fast evolution of data analytics platforms has resulted in an increasing demand for real-time data stream processing. From Internet of Things applications to the monitoring of telemetry generated in large data centers, a common demand for currently emerging scenarios is the need to process vast amounts of data with low latencies, generally performing the analysis process as close to the data source as possible. Devices and sensors generate streams of data across a diversity of locations and protocols. That data usually reaches a central platform that is used to store and process the streams. Processing can be done in real time, with transformations and enrichment happening on-the-fly, but it can also happen after data is stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are commonly used.

Stream processing platforms are required to be malleable and absorb spikes generated by fluctuations of data generation rates. Data is usually produced as time series that have to be aggregated using multiple operators, with sliding windows being one of the most common abstractions used to process data in real time. To satisfy the above-mentioned demands, efficient stream processing techniques that aggregate data with minimal computational cost need to be developed. However, data analytics might require aggregating extensive windows of data. Approximate computing has been a central paradigm for decades in data analytics in order to improve the performance and reduce the needed resources, such as memory, computation time, bandwidth or energy. In exchange for these improvements, the aggregated results suffer from a level of inaccuracy that in some cases can be predicted and constrained. This doctoral thesis aims to demonstrate that it is possible to have constant-time and memory-efficient aggregation functions with approximate computing mechanisms for constrained environments. In order to achieve this goal, the work has been structured in three research challenges.

First, we introduce a runtime to dynamically construct data stream processing topologies based on user-supplied code. These dynamic topologies are built on-the-fly using a data subscription model defined by the applications that consume data. The subscription-based programming model enables multiple users to deploy their own data-processing services.


On top of this runtime, we present the Amortized Monoid Tree Aggregator general sliding window aggregation framework, which seamlessly combines the following features: amortized O(1) time complexity and a worst case of O(log n) between insertions; it provides both a window aggregation mechanism and a window slide policy that are user programmable; the enforcement of the window sliding policy exhibits amortized O(1) computational cost for single evictions and supports bulk evictions with cost O(log n); and it requires a local memory space of O(log n). The framework can compute aggregations over multiple data dimensions, and has been designed to support decoupling computation and data storage through the use of distributed Key-Value Stores to keep window elements and partial aggregations.

Specially motivated by edge computing scenarios, we contribute the Approximate and Amortized Monoid Tree Aggregator (A2MTA). It is, to our knowledge, the first general purpose sliding window programmable framework that combines constant-time aggregations with error-bounded approximate computing techniques. A2MTA uses statistical analysis of the stream data in order to perform approximate aggregations, providing a critical reduction of the resources needed for massive stream data aggregation, and an improvement in performance.


Acknowledgements

Now I understand what it means to stand on the shoulders of giants.

The work presented here and my subsequent professional career are the fruit of a chain of coincidences that brought me close to extraordinary people, to whom I am very grateful for many reasons. This chain of coincidences has several especially important milestones.

On finishing the coursework of my computer engineering degree, and motivated by the excellent final projects of friends who were already engineers, I decided what I wanted to do for my final degree project, following my own interests and with the intention of having fun: a handler for events triggered by sensors, telemetry and third-party programs. I would never have said that the topic I was choosing then would be the seed of a doctorate. The professor who accepted my project without raising any objection was Juanjo Costa. That project is a success in my academic record, and this was thanks to his weekly advice and guidance.

As I was entering the working world as an engineer, Juanjo recommended that I look at a job offer posted on the notice board of the Barcelona Supercomputing Center, led by a professor from whom he promised I would learn a great deal: David Carrera.

With David (the supervisor of this thesis) I joined a project to do Big Data computation on data streams, which happened to be the natural evolution of what I had started in my final degree project. Two weeks after hiring me, David was introducing me to IBM as the stream processing expert for the joint project we were starting. There was no turning back: I had to become an expert or these very important people would smell something. Some time later, after going through impostor syndrome and learning a lot, the scene repeated itself with Cisco and then with Microsoft. In very little time I went on to work with first-rate engineers on important projects, both for large companies and for BSC itself.

Working on these projects, I had an idea that might have led nowhere: a new concept of efficient aggregators for data streams. I presented it to David, and he quickly gave me the green light and the resources to develop it. That idea has now become a patent that is the central contribution of this thesis, my initial contribution to a company of which I am a co-founder, and also a headache that has become chronic. Despite all this, the reason I am most grateful to David is for opening my eyes and making me switch camera ecosystems. Let it be on record here.

Of course, alongside all this, I have received a lot of help and support both from my university classmates and from my work colleagues. My friends, they are the best engineers I have known and they have forced me to improve in order to get closer to their level: Cesare, whom I met the day we both started working at BSC, and who in a short time became a very good friend; Marcelo and Nicola, with whom I have shared quite a few beers while we convinced each other that this doctorate business is worth it; Josep Lluís and Alberto, who helped me with the statistics and with putting on paper the ideas I had in my head; Òscar, Tom and Sergi, former colleagues at BSC and current colleagues in the recent adventure of starting a company with David, thanks to whom it has been possible to close this project while starting another, even more ambitious one; and in general all the colleagues who have passed through C6-E201 and K2M-S204b.

I also want to make a special mention of the person with whom I have shared this whole experience day by day, my main support and the one I love most: Eli. She has been by my side when it seemed that logic had abandoned me, she has been understanding when we went out for a drink but my head was still at work, and she has covered for my absences when deadlines were approaching. Without her I would have completely lost my sanity long ago.

Finally, I want to thank my parents for the values they have instilled in me throughout my life, and because they have always encouraged my curiosity and my interest in science and technology.


List of Figures

1.1 Contribution’s milestones

2.1 Big Data stream processing platforms basic architecture

2.2 Computing a query on the Lambda Architecture

2.3 Computing a query on the Kappa Architecture

3.1 Lock-free asynchronous model used in ServIoTicy

3.2 Old data discard

3.3 Relation between a pipeline and its execution trees

3.4 Representation of topology #3

3.5 Node latency by degree

3.6 Stage latency by degree

3.7 Types of tested pipeline, each one maximizing a property

3.8 Time to dispatch a SU through an entire topology

4.1 Log MTA Structure and Element Location Examples

4.2 Log MTA Bulk Eviction. Monoid: max(x, y); WSP: total − old ≥ 4

4.3 Log MTA KVS data structure

4.4 Amortized MTA Structure and Element Location Examples

4.5 AMTA single update eviction running example

4.6 Average latency for constant-sized windows

4.7 Window bulk eviction average latency, using different y-axis scales to show different details

4.8 Average window size reached per allocated memory amount, for a 2^25 updates capacity

5.1 Error generated by stream update buckets. Monoid: count; WSP: count > 10

5.2 Bulk eviction buckets. Predicted eviction: 6 ± 1

5.3 A2MTA data-structure constrained with 6 leaves

5.4 Effective error in a sum-like histogram

5.5 Effective error in a constrained sum-like window

5.6 Effective error in a max-like histogram

5.7 Effective error in a constrained max-like window

5.8 Effective error in a constrained hopping window

5.9 Sum-like histogram: 0.1% error

5.10 Max-like histogram: 10^5 block size


List of Tables

3.1 Pseudo-random topologies

4.1 Sliding window frameworks comparison

4.2 Window latencies in nanoseconds with different monoids and WSPs

5.1 Scenario-specific bucket aggregation method’s footprint relative to AMTA’s


List of Abbreviations

A2MTA Approximate & Amortized Monoid Tree Aggregator

ADWIN Adaptive Window

AMTA Amortized Monoid Tree Aggregator

API Application Programming Interface

ASA Azure Stream Analytics

CAP Consistency, Availability & Partition (tolerance)

CRUD Create, Replace, Update & Delete

DABA De-Amortized Banker’s Aggregator

DAG Directed Acyclic Graph

DPP Data Processing Pipelines

FIFO First In - First Out

IoT Internet of Things

JSON JavaScript Object Notation

KDA Kinesis Data Analytics

KVS Key-Value Store

LMTA Logarithmic Monoid Tree Aggregator

MQTT Message Queuing Telemetry Transport

MTA Monoid Tree Aggregator

OSS Open Source Software

RA Reactive Aggregator

REST Representational State Transfer

SO Service Object

SPL Stream Processing Language

SQL Structured Query Language

STOMP Simple (or Streaming) Text Oriented Message Protocol

SU Sensor Update

SWAG Sliding-Window Aggregation

WO Web Object


Chapter 1

Introduction

1.1 Motivation

Over the last years, Internet of Things (IoT) and Big Data platforms have been clearly converging in terms of technologies, problems and approaches. IoT ecosystems generate a vast amount of data that needs to be stored and processed, becoming a Big Data problem. IoT devices and sensors generate streams of data across a diversity of locations and protocols that in the end reach a central platform that is used to store and process them. Processing can be done in real time, with transformations and enrichment happening on-the-fly, but it can also happen after data is stored and organized in repositories. In the former case, real-time processing technologies like Storm [19] [96] are required to operate on the data; in the latter, batch processing like Hadoop [14] is commonly used. Stream processing prioritizes low latency above throughput and continuously calculates results, in contrast to batch processing, which gives preference to throughput and runs over larger time spans after accumulating larger amounts of new data. IoT use cases usually involve immediate reaction to event detection, or continuous telemetry monitoring. In such scenarios, low latency is the priority and stream processing is a clear solution for data analytics.

When an entity wants to access a feature’s data stream, e.g. the last hour’s average temperature in a location, that entity has two main options to retrieve the data. The first option is to deploy from scratch the infrastructure needed to retrieve the data: sensors and related connectivity. The second option is to get access to an existing data stream from another entity that contains information related to the target feature. Following the previous example, a city council might have temperature sensors deployed all over the city generating temperature data streams. If these streams are shared with the interested party and accessible from a stream processing platform, the only thing left to do would be to perform the average aggregation on the stream. It is cheaper to share data streams in a multi-tenant data stream processing platform than to deploy the same set of sensors per tenant. Moreover, an entity might be interested in the composition of several streams. Consider an entity interested in the wind chill factor. The wind chill factor is calculated using the wind velocity and the air temperature. This entity could use third parties’ wind velocity and air temperature streams in order to make a continuous wind chill factor calculation.
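The wind chill composition described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Update` type and the stream names `"temperature"` and `"wind"` are hypothetical, and the index is computed with the commonly used North American wind chill formula (air temperature in degrees Celsius, wind speed in km/h), which the text does not prescribe.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Update:
    stream: str       # hypothetical stream name, e.g. "temperature" or "wind"
    timestamp: float  # seconds since some epoch
    value: float


def wind_chill(temp_c: float, wind_kmh: float) -> float:
    """Wind chill index in deg C (assumed formula, intended for cold, windy conditions)."""
    v = wind_kmh ** 0.16
    return 13.12 + 0.6215 * temp_c - 11.37 * v + 0.3965 * temp_c * v


def wind_chill_stream(updates: Iterable[Update]) -> Iterator[Update]:
    """Emit a derived update whenever an input changes and both inputs are known."""
    latest: Dict[str, float] = {}
    for u in updates:
        latest[u.stream] = u.value
        if "temperature" in latest and "wind" in latest:
            yield Update("wind_chill", u.timestamp,
                         wind_chill(latest["temperature"], latest["wind"]))
```

The derived stream is itself a sequence of updates, so another tenant could subscribe to it in the same way the sketch subscribes to its two inputs.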

The operations performed on the data might not depend on a single tenant, like the owner of the sensors or the owner of the data processing infrastructure. Furthermore, the results from the stream analytics on sensor updates end up being new streams, and other third parties might be interested in them. So a pipeline of data stream operations might be performed by a combination of tenants, and would need to grow dynamically while it is running. This is a very demanding environment in which the execution topologies are potentially vast directed acyclic graphs (DAG), with each vertex being an operation that in some cases might be challenging to run in terms of time and space. With such an execution topology, the vertices need to be loaded to memory dynamically whenever they receive a stream update. Otherwise the resources will easily become scarce. We refer to dynamic pipelining as the combination of operations subscribing to an existing stream on-the-fly and operations being loaded only when they need to compute a stream update.

Aggregate functions operate on extensive amounts of data to produce a single result. Big Data traditionally solves the problem of data aggregation with batch processing. Batch processing uses programming models such as MapReduce with efficient algorithms. Such programming models enable efficient and linearly scalable data analytics of massive amounts of data with high throughput and fault tolerance. This linear scalability consists of distributing the computation and storage of the data; by replicating said data, fault tolerance is obtained as well. If more computation resources are needed, this can be solved by just adding new computation nodes to the batch processing system.


relatively small amounts of data because it lacks efficient aggregators for massive data that fit the following requirements for real-time computation:

• Efficient programming model.

• Low-latency incremental computation.

• Fault-tolerance.

• Constantly distributed and replicated data.

Since a data stream is virtually infinite, the data to be aggregated needs to be narrowed down in stream sliding windows, e.g. data from the last year. This is also the case in batch processing, where a batch might also contain data from the last year. Nevertheless, batch processing provides a single result from a closed set of data, and a new batch is required to produce a new result. Although the throughput is high, the latency to produce this single result is also relatively high. A stream processing sliding window generates a stream of real-time results as the time interval shifts forward, with very low latency for the computation of each incremental result. As a consequence of the lack of Big Data aggregators for stream processing, the window aggregators are not linearly scalable. Therefore, batch processing is traditionally used for massive data aggregations, while stream processing is used for continuous aggregations of constantly changing small data.
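The difference between one result per closed batch and one result per update can be sketched as follows. This is a minimal illustration, assuming a count-based window (the last `size` updates) and a mean aggregate; the function name is hypothetical. It relies on subtraction to remove the evicted value, a shortcut that only works for invertible aggregates such as sum and does not carry over to functions like max.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def sliding_mean(values: Iterable[float], size: int) -> Iterator[float]:
    """Yield the mean of the last `size` values after every new value."""
    window: Deque[float] = deque()
    total = 0.0
    for v in values:
        window.append(v)
        total += v
        if len(window) > size:
            # Evict the oldest value and undo its contribution to the sum.
            total -= window.popleft()
        yield total / len(window)
```

Each incoming update costs a constant amount of work, whereas a batch job would recompute the aggregate over the whole window for every new result.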

In this work, we will demonstrate that it is possible to use batch processing paradigms for stream processing aggregations, getting as a result continuous aggregations of vast amounts of data.

Aside from the scalable and distributed programming models, there are other paradigms that are relevant for the aggregation of Big Data. One of the paradigms widely used in Big Data batch processing is Approximate Computing. Approximate Computing improves the performance of data analytics algorithms and decreases the resources needed. However, the results of Approximate Computing algorithms may have some degree of inaccuracy. The computation strategies in Big Data Approximate Computing are usually software-level approximations rather than hardware ones, such as memoization, skipping loops in iterations or skipping data elements in an aggregation. The inaccuracy can be predicted, with a margin of error, and controlled or even limited. In a linearly scalable environment, Approximate Computing can not only increase throughput and reduce latencies, but also reduce the number of computation nodes needed for an aggregation, making it cheaper.
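As an illustration of skipping data elements with a predictable margin of error, the sketch below estimates a mean from a uniform random sample and reports a normal-approximation confidence margin. This is a generic sampling technique chosen for the example, not the mechanism developed in this thesis; the function name and the default `z = 1.96` (roughly a 95% interval) are assumptions.

```python
import math
import random
from typing import Sequence, Tuple


def approximate_mean(data: Sequence[float], sample_size: int,
                     rng: random.Random, z: float = 1.96) -> Tuple[float, float]:
    """Return (estimate, margin) for the mean of `data` using a random sample.

    Requires 2 <= sample_size <= len(data).
    """
    sample = rng.sample(data, sample_size)           # sampling without replacement
    mean = sum(sample) / sample_size
    variance = sum((x - mean) ** 2 for x in sample) / (sample_size - 1)
    margin = z * math.sqrt(variance / sample_size)   # z times the standard error
    return mean, margin
```

Only `sample_size` elements are read, so the cost no longer grows with the amount of data, and the margin shrinks as the sample grows.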

As far as possible, we want to prevent an operation as relevant for data analytics as an aggregate function from becoming a bottleneck in a data processing pipeline, which chains the latencies of multiple operations. Aggregate functions therefore need to compute each update in a time that keeps pace with the input update frequency.

The tenant-shared execution environment that we described requires stream operations and their data to be loaded only when needed, and to load a minimal amount of data. Having the data distributed and replicated in a scalable data store frees local resources and makes the aggregation fault-tolerant. If, in addition, only a small portion of that data is used in each aggregation, the operation can be loaded fast when needed and even more local resources are freed.

The considerations above motivate the following Doctoral Thesis statement:

It is possible to leverage dynamically pipelined topologies to combine scalable stream processing with the approximate computing paradigm to build efficient sliding window aggregators for resource-constrained environments.

In order to achieve this goal, we divided the work into three main contributions. The first one provides a stream processing platform with dynamically pipelined topologies. The second one is a constant-time and footprint efficient framework for general purpose sliding window aggregations. The third contribution applies approximate computing mechanisms to the sliding window framework.

1.2 Contributions and Publications

The contributions of this Doctoral Thesis aim to deliver a scalable and efficient aggregate functions framework for massive data in stream processing. This goal will be achieved by incrementally pushing each contribution towards execution in resource-scarce environments, such as Fog or Edge computing deployments. Furthermore, there will be a constant focus on having a fairly simple programming model to define aggregate functions.

FIGURE 1.1: Contribution’s milestones. Scope axis: Cloud, Edge, Constrained Edge. Value axis: Composite Streams, Scalable Aggregation, Limit Resources. Contribution 1: Dynamic Pipelines (stream processing framework). Contribution 2: Constant-time Window Aggregation (sliding window framework). Contribution 3: Approximate Window Computing (error bound aggregation).

The chart in Figure 1.1 shows on its axes the progression of the two main goals of this Thesis. The y axis represents the different increasingly resource-constrained scopes this work covers, while the x axis represents the development in the value it achieves in terms of computational efficiency. Each contribution can be found by crossing the milestones from each goal.

The work in this Doctoral Thesis is divided into the following three main contributions, supported by multiple peer-reviewed publications.

1.2.1 Dynamically Pipelined Processing for Composite Data Streams

Devices and sensors generate streams of data across a diversity of locations and protocols. That data usually reaches a central platform that is used to store and process the streams. Processing can be done in real time, with transformations and enrichment happening on-the-fly, but it can also happen after data is stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are commonly used.

This contribution introduces a runtime to dynamically construct data stream processing topologies based on user-supplied code. These dynamic topologies are built on-the-fly using a data subscription model defined by the applications that consume data. Each user-defined processing unit is called a Service Object. Every Service Object consumes input data streams and may produce output streams that others can consume. The subscription-based programming model enables multiple users to deploy their own data-processing services. The runtime does the dynamic forwarding of data and execution of Service Objects from different users. Data streams can originate in real-world devices or they can be the outputs of Service Objects. Furthermore, a Service Object can subscribe to multiple streams to produce a single composite stream.
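The subscription model described above can be sketched with a small in-process registry. This is a hedged illustration only, not the runtime's actual API: the `Runtime` class, its `deploy` and `publish` methods and the handler signature are all hypothetical names, execution is sequential, and the subscriptions are assumed to form an acyclic graph.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical Service Object body: receives the triggering stream name, the new
# value and the latest value seen on each subscribed stream. It returns an output
# value, or None to emit nothing for this update.
Handler = Callable[[str, float, Dict[str, float]], Optional[float]]


class Runtime:
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[str]] = defaultdict(list)
        self._handlers: Dict[str, Handler] = {}
        self._latest: Dict[str, Dict[str, float]] = {}
        self.log: List[Tuple[str, float]] = []  # every update that flowed through

    def deploy(self, name: str, inputs: List[str], handler: Handler) -> None:
        """Register a Service Object; this can happen while updates are flowing."""
        self._handlers[name] = handler
        self._latest[name] = {}
        for stream in inputs:
            self._subscribers[stream].append(name)

    def publish(self, stream: str, value: float) -> None:
        """Forward an update to every subscriber of `stream` (assumes no cycles)."""
        self.log.append((stream, value))
        for so in list(self._subscribers[stream]):
            self._latest[so][stream] = value
            out = self._handlers[so](stream, value, self._latest[so])
            if out is not None:
                self.publish(so, out)  # the output is itself a stream named after the SO
```

A Service Object subscribed to two streams can return `None` until it has seen both and then emit one composite value per update, which is the composite-stream case described in the text.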

The runtime leverages Apache Storm for parallel data processing, which, combined with dynamic user-code injection, enables multi-tenant stream processing topologies. In this work we describe the runtime, its features and implementation details, as well as a performance evaluation of some of its core components.

This contribution is supported by the following publications:

• Villalba, Á., Pérez, J. L., Carrera, D., Pedrinaci, C., & Panziera, L. (2015). servIoTicy and iServe: a Scalable Platform for Mining the IoT. Procedia Computer Science, 52, 1022-1027.

• Villalba, Á., & Carrera, D. (2018, August). Multi-tenant Pub/Sub Processing for Real-Time Data Streams. In European Conference on Parallel Processing (pp. 251-262). Springer, Cham.

• Pérez, J. L., Villalba, Á., Carrera, D., Larizgoitia, I., & Trifa, V. (2014, April). The COMPOSE API for the internet of things. In Proceedings of the 23rd International Conference on World Wide Web (pp. 971-976). ACM.

1.2.2 Constant-Time Sliding Window Framework with Reduced Memory Footprint and Efficient Bulk Evictions

The fast evolution of data analytics platforms has resulted in an increasing demand for real-time data stream processing. From Internet of Things applications to the monitoring of telemetry generated in large data centers, a common demand for currently emerging scenarios is the need to process vast amounts of data with low latencies, generally performing the analysis process as close to the data source as possible. Stream processing platforms are required to be malleable and absorb spikes generated by fluctuations of data generation rates. Data is usually produced as time series that have to be aggregated using multiple operators, with sliding windows being one of the most common abstractions used to process data in real time. To satisfy the above-mentioned demands, efficient stream processing techniques that aggregate data with minimal computational cost need to be developed.

In this contribution we present the Monoid Tree Aggregator general sliding window aggregation framework, which seamlessly combines the following features: amortized O(1) time complexity and a worst case of O(log n) between insertions; it provides both a window aggregation mechanism and a window slide policy that are user programmable; the enforcement of the window sliding policy exhibits amortized O(1) computational cost for single evictions and supports bulk evictions with cost O(log n); and it requires a local memory space of O(log n). The framework can compute aggregations over multiple data dimensions, and has been designed to support decoupling computation and data storage through the use of distributed Key-Value Stores to keep window elements and partial aggregations.

This contribution is supported by the following publications:

• Villalba, Á., Berral, J. L., & Carrera, D. (2018). Constant-Time Sliding Window Framework with Reduced Memory Footprint and Efficient Bulk Evictions. IEEE Transactions on Parallel and Distributed Systems.

• Villalba, Á., & Carrera, D. (2017). Distributed data structures for sliding window aggregation or similar applications. European Patent EP17382202.4, filed May 30, 2017.
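To make the idea of monoid-based window aggregation in this contribution concrete, the sketch below uses the well-known two-stacks technique. It is explicitly not the Monoid Tree Aggregator: it only shares the assumption that the aggregate is a monoid (an associative operation with an identity element), it gives amortized O(1) insert, evict and query without requiring an inverse operation, but a single eviction can cost O(n) and the whole window is kept in local memory. The class and method names are hypothetical.

```python
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class TwoStacksWindow(Generic[T]):
    """FIFO window aggregated with a user-supplied monoid (combine, identity)."""

    def __init__(self, combine: Callable[[T, T], T], identity: T) -> None:
        self._combine = combine
        self._identity = identity
        # Entries are (value, aggregate). In _back the aggregate covers everything
        # from the bottom of the stack up to the entry; in _front it covers the
        # entry and everything below it. The oldest element sits on top of _front.
        self._front: List[Tuple[T, T]] = []
        self._back: List[Tuple[T, T]] = []

    def insert(self, value: T) -> None:
        agg = self._back[-1][1] if self._back else self._identity
        self._back.append((value, self._combine(agg, value)))

    def evict(self) -> T:
        """Remove and return the oldest element."""
        if not self._front:
            # Flip: move every element to the front stack, newest first, so the
            # aggregates are rebuilt in oldest-to-newest order.
            while self._back:
                value, _ = self._back.pop()
                agg = self._front[-1][1] if self._front else self._identity
                self._front.append((value, self._combine(value, agg)))
        if not self._front:
            raise IndexError("evict from an empty window")
        return self._front.pop()[0]

    def query(self) -> T:
        front = self._front[-1][1] if self._front else self._identity
        back = self._back[-1][1] if self._back else self._identity
        return self._combine(front, back)

    def __len__(self) -> int:
        return len(self._front) + len(self._back)
```

Because elements are always combined in arrival order, the sketch also works for non-commutative monoids such as string concatenation. A window slide policy would sit on top of it, calling `evict` while the policy's condition holds.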

1.2.3 Approximate Sliding Window Framework with Error Control

The principal kind of aggregator for data streams is the sliding window, which defines boundaries on the aggregated stream values. However, data analytics might require aggregating extensive windows of data. Approximate computing has been a central paradigm for decades in data analytics in order to improve the performance and reduce the needed resources, such as memory, computation time, bandwidth or energy. In exchange for these improvements, the aggregated results suffer from a level of inaccuracy that in some cases can be predicted and constrained.

In this contribution we present the Approximate and Amortized Monoid Tree Aggregator (A2MTA). It is, to our knowledge, the first general purpose sliding window programmable framework that combines constant-time aggregations with error-bounded approximate computing techniques. It is very suitable for adverse stream processing environments, such as resource-scarce multi-tenant edge computing. The framework can compute aggregations over multiple data dimensions, error bounding any of them, and has been designed to support decoupling computation and data storage through the use of distributed Key-Value Stores to keep window elements and partial aggregations.

This contribution is supported by the following publication:

• Villalba, Á., & Carrera, D. (2019). Constant-Time Approximate Sliding Window Framework with Error Control. 22nd IEEE International Symposium On Real-time Computing.


Chapter 2

Background

This chapter sets a conceptual baseline on data stream processing for the rest of the Doctoral Thesis. The goal is to familiarize the reader with the existing concepts that shape this research field.

2.1 General Concepts

In this section we introduce general concepts that are central to stream processing, to help with the comprehension of the rest of the work.

The concepts listed next are the main objects of discussion, on which everything else is based:

• Update: A time-dependent change in the state of a data feature. A data feature refers to information about a specific object or scenario, e.g. the current temperature in Barcelona. Updates from the same data feature share the data structure and differ in their values. All updates’ data structures usually have timestamp and offset fields, which set the temporal context within the data feature and, combined, provide a unique ID for the update. Updates are the atomic unit in a data stream.

• Data stream: Unbounded sequence of ordered atomic updates on the same data feature. E.g., a stream associated with the temperature of a physical device D contains a sequence of updates of such temperature information coming from device D, each update replacing the previous one. A stream emits updates indefinitely; it has no finite size and lacks boundaries.

• Stream processing: Transformation of one or more update streams into one or more derivative update streams. Output stream updates are triggered by input stream updates. Stream processing can simply transform updates one to one from input to output, e.g. transforming temperature update values from Fahrenheit to Celsius degrees. However, the output streams can also be the result of complex data analytics, e.g. anomaly detection in telemetry streams using Kalman filters. One output stream might aggregate several updates from several input streams.

• Stream operator: From a high-level perspective, stream processing is performed using atomic stream operators. Stream operators are ideally low-latency operators for stream updates. The number of streams, or of updates from those streams, computed as operands depends on the operator. Examples would be transform, filter, or window aggregation.

• Partition or Shard: Stream processing can parallelize the computation at three levels: stream level, partition level and operation level. Different streams run in parallel as they are independent from each other. Operations can have their own inner mechanisms to run in multiple threads and speed up their execution. Partitions are divisions of a stream by some criterion that run in parallel. However, the same operations are applied to all the partitions of the same stream. Strong ordering between partitions can only be guaranteed by buffering them and performing a sort algorithm right before merging. Partition divisions are usually represented by the Group By SQL operator.

• Stream processing node: In a stream processing platform, pipelined operators can be grouped in different nodes in order to improve the job throughput. Dividing the computation into several nodes improves throughput, but it can add latency because of the transport of updates between nodes.
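The concepts in the list above can be sketched together in a few lines of Python. The names (`Update`, `transform`, `keep_if`, `partition_by`) and the field layout are assumptions made for this illustration, not part of any platform discussed here; `partition_by` materialises a finite prefix of a stream for readability, whereas a real platform would route each update as it arrives.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Update:
    feature: str    # data feature, e.g. "temperature/device-D" (hypothetical naming)
    timestamp: int  # temporal context of the update
    offset: int     # disambiguates updates that share a timestamp
    value: float

    @property
    def uid(self) -> Tuple[int, int]:
        """Timestamp and offset combined identify the update within its feature."""
        return (self.timestamp, self.offset)


def transform(stream: Iterable[Update], fn: Callable[[float], float]) -> Iterator[Update]:
    """One-to-one operator: each input update triggers one output update."""
    for u in stream:
        yield replace(u, value=fn(u.value))


def keep_if(stream: Iterable[Update], predicate: Callable[[Update], bool]) -> Iterator[Update]:
    """Filter operator: forwards only the updates that satisfy the predicate."""
    for u in stream:
        if predicate(u):
            yield u


def partition_by(stream: Iterable[Update],
                 key: Callable[[Update], Hashable]) -> Dict[Hashable, List[Update]]:
    """Group-By style division; order is preserved inside each partition only."""
    partitions: Dict[Hashable, List[Update]] = defaultdict(list)
    for u in stream:
        partitions[key(u)].append(u)
    return dict(partitions)
```

For instance, `transform(stream, lambda f: (f - 32) * 5 / 9)` is the Fahrenheit-to-Celsius conversion mentioned in the stream processing item, and operators compose by passing the output of one as the input of the next.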

All the modern scalable and distributed stream processing platforms for Big Data share the same elements and basic structure in their architectures, which are the following:

• Producers: External to the platform itself, producers send data streams to be processed to the platform, e.g. readings from a sensor. Therefore, they need connectivity to the platform. Multiple producers can emit updates of the same stream, and one producer can emit updates from multiple streams.

• Consumers: Like producers, consumers are external to the platform. They collect the updates generated by the stream processing platform in order to use them. For example, consumers can trigger actuators from the analysis of sensor streams. Multiple consumers can read the same stream, and multiple streams can be read by one consumer.

• Queue messaging system: It is in charge of update communication between consumers/producers and the platform. The communication channels are divided by partitions or topics. Each partition is usually treated as an independent queue, although the in/out policies do not need to be FIFO. Partitions are strongly ordered, and so consumers with a FIFO policy will receive updates in the same order as they were received by the partition. Other policies apart from in/out policies affect consumers, e.g. the maximum number of retained messages or the retention time interval. Queue messaging systems can be distributed and have partitions and/or partition replicas on different machines.

• Topology: The actual stream computation happens in the topology, which is a directed acyclic graph (DAG) of computation pipelines. Each node contains a section of the computation that will be performed on an update. Nodes run in parallel and can be replicated to improve throughput. The edges between the nodes define how the updates flow between them. A node with multiple output edges will emit updates to a subset of these edges per input update. Pipelines have source nodes that retrieve messages from the input queue messaging system, and sink nodes that publish the updates to the output queue messaging system. Sources can retrieve updates from multiple partitions and sinks can emit updates to multiple partitions.
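The topology element above can be illustrated with a small single-threaded sketch. It is not the runtime of any particular platform: the node names, the dictionary-based DAG description and the `run_topology` function are hypothetical, and for simplicity every output is forwarded to all the outgoing edges of a node rather than to a subset of them.

```python
from collections import deque

def run_topology(nodes, edges, source, input_queue):
    """Push every update from the input queue through a DAG of nodes.

    nodes: name -> function(update) returning a list of output updates
    edges: name -> list of downstream node names (no edges means sink)
    """
    output_queue = []
    pending = deque((source, u) for u in input_queue)
    while pending:
        name, update = pending.popleft()
        for out in nodes[name](update):
            targets = edges.get(name, [])
            if not targets:                 # sink: publish to the output queue
                output_queue.append(out)
            for target in targets:
                pending.append((target, out))
    return output_queue

nodes = {
    "source": lambda u: [u],
    "parse": lambda u: [float(u)],
    "drop_negative": lambda u: [u] if u >= 0 else [],
    "sink": lambda u: [u],
}
edges = {
    "source": ["parse"],
    "parse": ["drop_negative"],
    "drop_negative": ["sink"],
}
result = run_topology(nodes, edges, "source", ["21.5", "-3", "0"])
```

In a distributed platform each node would run in its own process or thread, and the `pending` queue would be replaced by the transport between nodes.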

Figure 2.1 is a general diagram of a stream processing platform architecture and the update flow from producer to consumer.

FIGURE 2.1: Big Data stream processing platforms basic architecture

There is also a set of characteristics that differentiates stream processing platforms and makes each of them more convenient in specific situations. Some of these characteristics are simply performance metrics like latency and throughput, where improving one generally comes at the expense of the other. Furthermore, these metrics are also affected by the following characteristics:

• Strong ordering guarantee: Processed updates can either be emitted in the same order they were generated or not. Update order can be altered by parallel computation of the updates. Most stream processing platforms can guarantee strong ordering if necessary.

• Update processing guarantees: Failures can happen in the pipeline while computing a number of updates. The update processing guarantee defines the implications for the updates being processed in case of a failure, in terms of how many times an update can be processed and emitted to the output queue. There are four options.

– At least once: It guarantees that each update inserted in the input queue will be processed, but it does not specify how many times. The same output update can be emitted multiple times.

– At most once: Updates are not processed and emitted more than once, but some updates might be lost due to a failure.

– Exactly once: All updates are always processed once and only once, regardless of failures.

– None: There is no guarantee on how many times an update will be processed; the behavior is best effort, trying to reduce the loss and repetition of updates.


• Fault-tolerance method: In order to enforce the strong ordering and update processing guarantees, some update processing control procedures need to be enforced. The following are some examples:

– Update acknowledgement: Each node of the topology that has processed an update sends back to the previous node an acknowledgement that it has been processed. The source of the topology keeps a backup of all the tuples it generates. Once a source update has received acknowledgements for all the updates generated from it down to the sinks, it can safely be discarded from the upstream backup. At failure, if not all acknowledgements have been received, then the source update is replayed. This guarantees no data loss, but does result in out-of-order updates and duplicate updates passing through the system (at-least-once processing). Update acknowledgement also works as part of a backpressure handling mechanism, keeping control of all the updates in flight in the topology.

– Micro batches: In order to overcome the complexity and overhead of update-level synchronization that comes with the model of continuous operators that process and buffer updates, a continuous computation is broken down into a series of small, atomic batch jobs (called micro-batches) with a transactional ID assigned. Each micro-batch may either succeed or fail. At a failure, the latest micro-batch can simply be recomputed. This method enforces exactly-once processing and strong ordering, at the cost of higher computation latency. However, with batch-related operators, it can improve throughput.

– Transactional updates: Atomically log update deliveries together with updates to the state. Upon failure, state and record deliveries are repeated from the log. This guarantees exactly-once processing and strong ordering.

– Checkpointing: This scenario can be considered a composition of the micro-batch and transactional update mechanisms. During intervals of updates, nodes update their state in a distributed snapshot so they can be recovered on failure. The sinks buffer the updates up to the next checkpoint, and then emit them all together.

• Pull/push communication: Update communication between topology nodes can be either pull-based or push-based. Pull communication requires a node to request more updates from the previous nodes when it is free. This method is not the most latency efficient, but it provides a very straightforward backpressure handling mechanism. Furthermore, as messages are passed between nodes as batches, it has good throughput performance. Push-based communication consists of nodes actively sending updates to the following nodes, which store the updates in input buffers until they are processed. This method is more latency efficient.

• Backpressure handling: Backpressure is the situation in which the input update rate is higher than the processed update rate. Backpressure handling mechanisms rely on buffers and a durable queue-based messaging system. Update drop policies can be applied in such mechanisms.
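To make the relation between update acknowledgement and at-least-once processing more concrete, the following sketch replays an update whenever its acknowledgement is lost. It is a simplified, sequential model and not the protocol of any specific platform: the function name, the `fail_before_ack` parameter and the way failures are injected are assumptions for illustration, and out-of-order effects are not modeled.

```python
def run_at_least_once(updates, process, fail_before_ack):
    """Replay every update until it is acknowledged.

    fail_before_ack: set of (update, attempt) pairs for which the
    acknowledgement is lost after the output was already emitted.
    """
    backup = list(updates)          # upstream backup kept by the source
    emitted = []
    attempts = {}
    while backup:
        update = backup[0]
        attempts[update] = attempts.get(update, 0) + 1
        emitted.append(process(update))
        if (update, attempts[update]) in fail_before_ack:
            continue                # no ack received: replay the same update
        backup.pop(0)               # ack received: discard from the backup
    return emitted

# The acknowledgement of update 2 is lost on its first attempt.
out = run_at_least_once([1, 2, 3], lambda u: u * 10, {(2, 1)})
```

No update is lost, but the output of the replayed update appears twice, which is exactly the duplication that at-least-once processing allows.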

2.2 Stream Processing Platforms

In the last decade and during the course of this work, there have been great efforts on different fronts in the research and development of stream processing platforms and programming models for scalable big data stream analytics. Two of the most relevant fronts in these efforts are the open-source community and the commercial cloud providers.

The open-source community generated multiple widely-adopted platforms for the computation of data stream analytics. Apache Storm [19], originally Backtype Storm, has been a popular stream processing platform since its release in 2011. Its runtime works on the JVM and it is written in Clojure, a Lisp dialect for the JVM with a strong focus on concurrent computation. Storm is a multi-language runtime, thanks to an Apache Thrift [20] definition in its core that decouples the language of the topology code from the runtime execution. Storm initially based its runtime on ZeroMQ [56] sockets with a pub-sub pattern between stream processing nodes, which later became optional and were replaced by default by the more JVM-specific Netty [80] sockets. Storm, like most modern stream processing platforms, is usually paired with Kafka as an input queue. Storm works at a fairly low level, in which each computation node is programmed, and the connections between nodes are configured, by the user. It guarantees at-least-once update processing and uses the update acknowledgement fault-tolerance method. However, it has a high-level abstraction called Trident which organizes the topology from a query-like instruction from the user. When Trident is used, Storm can guarantee exactly-once update processing and works with micro-batches.

Apache Flink [13] has become a well-known stream processing platform since its first release in 2015. Running on the JVM and written in Scala and Java, it is built upon the Akka toolkit. Topologies can be developed in Scala, Java, Python and SQL through its APIs: DataStream, DataSet and Table. While DataStream provides an API for both bounded and unbounded data streams, DataSet works only for bounded streams. The Table API is more high-level and it is programmed in a SQL-like language. Flink also works with the Apache Beam programming model, an open-source unified programming model to generate topologies for both batch and stream processing. Flink provides exactly-once update processing and its fault tolerance is based on checkpointing. Furthermore, Akka [5] also provides a stream processing runtime by itself with a rich set of operators. It can be programmed in Scala and Java, and among its characteristics it guarantees at-most-once update processing.

A more throughput-centered platform can be found in Apache Spark Streaming [18]. Spark Streaming shares its API with Spark, so streaming jobs are programmed the same way as batch jobs. Spark Streaming jobs can be written in Java, Scala and Python. Instead of performing large batch jobs, as Apache Spark is designed to do, it reduces the size of the batches to micro-batches, and so it can use the same logic. It guarantees exactly-once update processing.

Apache Kafka Streams [15] provides a Java library for stream processing in Kafka client applications. Instead of using ZeroMQ, Netty or Akka for update communication between stream computation nodes, Kafka is the main messaging bus and not only the input queue system. It provides means to distribute and parallelize a stream topology, and operators to perform efficient analytics. It guarantees exactly-once update processing.

The main cloud providers also offer their own stream processing platforms integrated with the rest of their commercial solutions. Microsoft Azure Stream Analytics (ASA) is the stream processing service in Microsoft Azure. ASA is programmed in a SQL-like language that treats stream updates as rows in a database table. The query runs continuously with each result row being a new stream update. That SQL-like query is compiled to generate a topology that uses Trill [38] as a query processor. ASA can also be executed on the edge in order to improve latency by using the Azure IoT Edge ecosystem. It guarantees at-least-once update processing.

Amazon Kinesis Data Analytics (KDA) is Amazon Web Services' data stream processing service. Like ASA, KDA is programmed in SQL, which in turn generates the stream processing topology. It offers a selection of pre-built stream processing templates and advanced high-level operators. KDA uses an at-least-once processing and delivery model in the event of an application interruption.

Google Cloud Dataflow is a cloud service that computes both batches and streams, with Apache Beam as its programming model. Dataflow provides an exactly-once update processing guarantee.

IBM Streaming Analytics is a platform that can run either on premises or in IBM Cloud as a service. Its programming language is the Stream Processing Language (SPL), a language specific to topology composition, with operators that can be programmed in Java or C++. SPL has a rich set of toolkits and efficient operators for data streams. Like most of the other platforms, IBM Streaming Analytics guarantees at-least-once update processing.

2.3 Big Data Architectures

Stream processing can be found in multiple Big Data architectures depending on what problem it is solving. However, there are two main architecture trends called Lambda Architecture and Kappa Architecture. The first one sets stream processing aside, using it only to provide immediate partial results while batch processing performs the target analytics. Kappa Architecture is completely centered on stream processing and considers all data produced as immutable, continuously performing reliable analytics on the input data. We propose a scalable system consistent with the Kappa Architecture for large computation topologies, with efficient aggregators.

In this section we summarize the two architectures.

2.3.1 Lambda Architecture

Batch processing and real-time processing have been widely considered to complement each other. In summary, real-time technologies are usually seen as fast but complex and unreliable in terms of fault-tolerance and consistency. On the other hand, batch processing is considered the robust option because of its simplicity, but it has a big delay from update to update. The architecture combining both kinds of technologies to process data is known as Lambda Architecture [75] [74]. This architecture aims to avoid the CAP theorem [52] problems when sacrificing consistency.

To avoid (or minimize) the CAP problem, the Lambda Architecture considers three main conceptual characteristics of the data and queries. The first characteristic is that the data is inherently time based. When a new atomic piece of data in a data set is received, the information it carries is always true when its temporal context is considered. For instance, an update on the temperature in Barcelona might be 30 degrees Celsius today at 17:00. Later a new update can be 29 degrees Celsius at 18:00. This new update does not invalidate the previous one; it is still true that at 17:00 Barcelona was at 30 degrees Celsius.

That leads to the second characteristic: because the data is time based, it is immutable. The functions on storage must be CR (create and read) instead of the typical CRUD (create, read, update and delete). This is not a new approach to dealing with data accessed by several processes in parallel. For instance, it is very usual for functional programming languages like Erlang [23] to work with immutable variables for this same reason.


FIGURE 2.2: Computing a query on the Lambda Architecture

The third main characteristic is more related to the queries on the data. Considering a query to be any function using a data set as an input, the queries must be precomputed before they are addressed, using incremental algorithms. Reliable batch processing technologies like Hadoop take the main responsibility for this part. The description of this architecture argues that the responsibility falls on batch processing because of its robustness, given by the simplicity of the databases it relies on. The databases for batch processing, like Voldemort [89], only have support for batch writes, avoiding the inherent complexity of random writes.

Batch processing operates on the whole data set in batches spanning hours of data, and so an atomic update might take hours to be processed. The real-time layer works in parallel with the batch layer to cover the last few hours of data that have not yet been taken care of by the batch layer. The reasoning behind this is that real-time technologies perform better on small sets of data when solving consistency issues. Usually Storm is used for this purpose with databases like Cassandra [93] or Couchbase [42].

This situation leaves the system with two discrete views of the data: the batch view and the real-time view. The merge of the two views is left for query invocation time, as can be seen in Figure 2.2.
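The merge of the two views at query time can be sketched as follows. This is only an illustration of the idea under simplifying assumptions: both views are computed from one in-memory list, the `batch_horizon` parameter marks the last timestamp covered by the batch layer, and all the function names are hypothetical.

```python
def batch_view(updates, batch_horizon):
    """Precomputed by the batch layer: only covers updates up to the horizon."""
    covered = [u for u in updates if u["ts"] <= batch_horizon]
    return {"sum": sum(u["value"] for u in covered), "count": len(covered)}

def realtime_view(updates, batch_horizon):
    """Maintained by the real-time layer for updates the batch has not seen."""
    recent = [u for u in updates if u["ts"] > batch_horizon]
    return {"sum": sum(u["value"] for u in recent), "count": len(recent)}

def query_average(batch, realtime):
    """At query time the two views are merged into a single answer."""
    total = batch["sum"] + realtime["sum"]
    count = batch["count"] + realtime["count"]
    return total / count if count else None

log = [
    {"ts": 17, "value": 30.0},
    {"ts": 18, "value": 29.0},
    {"ts": 19, "value": 28.0},
]
avg = query_average(batch_view(log, 17), realtime_view(log, 17))
```

Both views store a partial aggregate (sum and count) rather than the final average, because partial aggregates can be merged while two averages cannot.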

FIGURE 2.3: Computing a query on the Kappa Architecture

The set-up of this architecture, including the replication of code for both the batch processing layer and the real-time layer, is up to the developers. However, Twitter released Summingbird [34] as an open-source project. Summingbird works as a Scala abstraction over the Lambda Architecture. Data analytics are written once in Scala and are deployed to Hadoop and Storm. Summingbird leverages Algebird [7] aggregators. Algebird enforces a programming model in which the Reduce phase of MapReduce is an associative function with identical types for the inputs and the output (called semigroup or monoid [73]). Associativity enables the platform for high parallelism and very fast real-time aggregations.
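The role of associativity can be shown with a small monoid for the average, written here in Python rather than with the Algebird API. The names `ZERO`, `plus`, `prepare` and `aggregate` are chosen only for this sketch. Because `plus` is associative and has an identity element, independent chunks of the data can be reduced separately and merged afterwards with the same result.

```python
from functools import reduce

ZERO = (0.0, 0)                      # identity element of the monoid

def plus(a, b):
    """Associative combination of two partial (sum, count) aggregates."""
    return (a[0] + b[0], a[1] + b[1])

def prepare(value):
    """Lift a raw value into the aggregate type."""
    return (float(value), 1)

def aggregate(values):
    return reduce(plus, map(prepare, values), ZERO)

values = [30, 29, 28, 31]
sequential = aggregate(values)
# Associativity lets independent chunks be reduced separately and merged later.
chunked = plus(aggregate(values[:2]), aggregate(values[2:]))
mean = chunked[0] / chunked[1]
```

Inputs and output of `plus` share the same type, which is what allows a partial result to be fed back as an operand at any level of the reduction.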

2.3.2 Kappa Architecture

The Lambda Architecture has important downsides. It needs two different platforms to run the same analytics on the data. These analytics are written for two different programming models and so the code is duplicated. Summingbird hides this from the developer, but the single code base is still translated to the batch and real-time layers. On the one hand this makes it difficult to debug, and on the other hand the two platforms are still there.

The claimed motivations for such an architecture are that real-time processing is generally said to be less powerful and less reliable than batch processing, and that with this mixture of different data systems the CAP theorem is avoided.

The Kappa Architecture [67] is defined as an improvement over the Lambda Architecture, considering the two previous motivations as false. Although it is true that batch processing technologies are much more mature than real-time technologies, that does not mean that real-time technologies need to be more unstable. Furthermore, even if batch-write-only databases are simpler, the Lambda Architecture does not manage to beat the CAP theorem [28].

This criticism of the Lambda Architecture leads to the conclusion that the batch processing layer is not necessary, as shown in Figure 2.3. Some implementations of the Kappa Architecture are Samza [16] or Flink [13].

2.4 Operations on IoT data

In this section we describe the kinds of operations that we identified as common when performing analytics on sensor-generated data streams and their derivative streams, and that have also been adopted by high-level stream processing platforms during the course of this work. All these operations need to work efficiently in order to provide an environment suitable for the adoption of the Kappa Architecture, and in some cases they have to follow a set of rules in order to keep time consistency in the results.

IoT-generated data streams are generally produced by sensors emitting synchronous telemetry from multiple dimensions of specific features. Although the updates from a data stream might not arrive in the same order as they were sensed, the streams have a strict time order which can be identified by a generation timestamp in each update. There exist mechanisms to act upon unsorted data stream updates, such as micro-batches and stream buffers [4]. However, the reordering of data streams is out of the scope of this work.

Most of the operations described in this section are meant to produce new derivative data streams. Each update is operated on either alone or with other updates, and therefore produces a new result update that will belong to another data stream. New derivative streams can also have operators applied to them. These sequences of operations will be called Data Processing Pipelines (DPP) from now on.

2.4.1 Index and query

A very basic operation on a data set is performing queries to obtain a subset of data from it. In order to do that, the data needs to be properly indexed as it is stored. This kind of operation can be found in any relational database and in most Key-Value stores (KVS). In stream processing, indexing and querying historical data is not generally a desirable scenario. Streams are unbounded, and therefore they will contain virtually infinite data. When used in the context of data analytics, queries usually perform aggregations. In the specific scenario of stream processing, this translates to a pipeline of operations that efficiently produce incremental aggregations from a set of stream updates. Any query whose final purpose is to aggregate the data should be computed while the data is being produced. However, there are frequent and viable kinds of queries related to IoT data stream updates. They usually require a small window of updates in a stream, such as the newest update. For example, it is very usual to query the last update of a stream or all the stream updates produced near a specific location. To be able to run such queries over a large volume of streams in little time, the updates need to be indexed when received. Elasticsearch [47] and Solr [17] are search engines that provide such functionality.

2.4.2 Filter

In a DPP, it is very common to discard updates that do not satisfy some parameters. A filter is a set of conditions applied to the input update; when they are not fulfilled, the update does not continue on that branch of the processing pipeline. It acts upon a single stream. Sometimes we are only interested in update values inside a threshold, or in avoiding clearly erroneous values. For example, detecting sound peaks in decibels or working only with extreme temperatures in order to trigger an action.

The kind of conditions found in a filter are expected to be resolved in O(1) time, or in O(k) time, k being the number of data dimensions found in each update. Filters do not require any kind of update memory storage or persistence.
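A filter of this kind can be sketched as a set of per-dimension predicates, each evaluated once per update, which gives the O(k) cost mentioned above. The `make_filter` helper, the `celsius` dimension and the thresholds are hypothetical values chosen for illustration.

```python
def make_filter(conditions):
    """Build a filter from per-dimension predicates; cost is O(k) per update."""
    def keep(update):
        return all(check(update[dim]) for dim, check in conditions.items())
    return keep

# Hypothetical thresholds: keep only plausible but extreme temperature readings.
is_extreme = make_filter({
    "celsius": lambda t: -90.0 <= t <= 60.0 and (t <= -10.0 or t >= 35.0),
})

readings = [
    {"celsius": 21.0},
    {"celsius": 38.5},
    {"celsius": 999.0},    # clearly erroneous value
    {"celsius": -15.0},
]
kept = [r for r in readings if is_extreme(r)]
```

The filter keeps no state between updates, so it needs neither memory storage nor persistence.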

2.4.3 Transform

Similarly to filters, transformations are operations with a single input stream. In the MapReduce programming model, a transformation would be the map phase. It applies the same transformation to every input update, generating a new stream. For instance, a transformation can perform unit conversions or reconfigure the data dimensions in the updates in order to perform an aggregation in further stages. The code of the transformation would be provided by the user of the operator.

A transformation is expected to be resolved in O(1) time, or in O(k) time, k being the number of data dimensions found in each update. Furthermore, no update memory storage or persistence is required to perform a transformation.
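As an illustrative sketch, a transformation can be modeled as a generator that applies a user-provided function to each update, producing exactly one output per input. The field names `fahrenheit` and `celsius` and both function names are assumptions of this example.

```python
def transform(stream, fn):
    """Apply the user-provided function to every update, one output per input."""
    for update in stream:
        yield fn(update)

def fahrenheit_to_celsius(update):
    # Unit conversion plus a renamed data dimension for later stages.
    return {"ts": update["ts"], "celsius": (update["fahrenheit"] - 32) * 5 / 9}

source = [{"ts": 1, "fahrenheit": 32}, {"ts": 2, "fahrenheit": 212}]
derived = list(transform(source, fahrenheit_to_celsius))
```

Because the generator consumes one update at a time, the same code works for an unbounded stream without buffering.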

2.4.4 Aggregate

Data aggregation is the most complex operation, because it potentially involves vast amounts of data stream updates. Aggregations performed with the Lambda Architecture rely on most of the computational cost of a final aggregation being done with batch processing. However, in the Kappa Architecture we do not rely on batch processing, and a new total aggregation result must be provided for each new stream update. In this work we will focus on aggregator frameworks, in which the user can define the updates to be aggregated and the specific aggregation to be performed, following a specific programming model. In the MapReduce programming model, all of them would be the reduce phase, in contrast to the transformations.

There are different kinds of aggregators, depending on the origin data being aggregated. Some examples of aggregator frameworks would be:

• Accumulator: Aggregates all-time updates. For example, the all-time average temperature from a sensor stream. Its cost can be O(1) using binary associative operations. It only requires one update stored in memory or persisted, for the current aggregation result.

• Sliding Window: It performs an aggregation of a subsequence of updates from the stream, always including the newest update. For instance, it would aggregate the last hour of temperature from a sensor stream. As will be demonstrated in this work, its cost can be O(1) using binary associative operations. Furthermore, it has an O(n) memory cost, n being the number of updates aggregated in the window. However, that memory cost can be reduced in exchange for losing aggregation accuracy.
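The two aggregators above can be sketched for an arbitrary binary associative operation. The sliding window below uses the well-known two-stacks technique, shown here only as one possible way to reach an amortized O(1) cost per update; it is not necessarily the method developed in this work. The window is count-based for simplicity, and the class and method names are hypothetical.

```python
class Accumulator:
    """All-time aggregate: O(1) per update, one stored value."""
    def __init__(self, op):
        self.op, self.result = op, None

    def push(self, value):
        self.result = value if self.result is None else self.op(self.result, value)
        return self.result

class SlidingWindow:
    """Count-based window over the last `size` updates (two-stacks technique).

    Every update is pushed and popped at most once per stack, so the cost is
    amortized O(1) applications of `op` per update, with O(size) memory.
    """
    def __init__(self, op, size):
        self.op, self.size = op, size
        self.front = []   # (value, aggregate of this value and all newer front values)
        self.back = []    # (value, aggregate of all back values up to this one)

    def push(self, value):
        if len(self.front) + len(self.back) == self.size:
            self._evict()
        agg = value if not self.back else self.op(self.back[-1][1], value)
        self.back.append((value, agg))
        return self.query()

    def _evict(self):
        if not self.front:
            while self.back:   # flip: the oldest value ends on top of front
                value, _ = self.back.pop()
                agg = value if not self.front else self.op(value, self.front[-1][1])
                self.front.append((value, agg))
        self.front.pop()

    def query(self):
        if self.front and self.back:
            return self.op(self.front[-1][1], self.back[-1][1])
        return (self.front or self.back)[-1][1]

window = SlidingWindow(max, 3)
maxima = [window.push(v) for v in [5, 1, 4, 2, 3]]
```

Operands are always combined in arrival order, so the sketch also works for associative operations that are not commutative.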

2.4.5 Union

Multiple streams might produce complementary updates on the same feature. For example, multiple temperature sensors might produce updates at different times. The updates will be produced in different streams, but they can be merged in a single richer stream. This requires that the streams are equivalent in form, with the same data dimensions, types and units. Transformations can be used to adapt different streams to the same format.

This operation only requires two or more streams that will be re-emitted under the same stream, and therefore has a negligible cost by itself. Furthermore, it does not require update memory storage or persistence.
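A union can be sketched as a stateless re-emission of updates under a common stream identifier. In this illustration the events are (stream identifier, update) pairs in arrival order, and the equivalence of the streams is checked only through their data dimensions; the function name, parameters and stream identifiers are hypothetical.

```python
def union(events, members, out_stream, dimensions):
    """Re-emit the updates of several equivalent streams under one stream id.

    events: (stream_id, update) pairs in arrival order.
    """
    for stream_id, update in events:
        if stream_id not in members:
            continue
        if set(update) != dimensions:
            raise ValueError(f"stream {stream_id!r} is not equivalent in form")
        yield out_stream, update

events = [
    ("sensor-1", {"ts": 1, "celsius": 20.0}),
    ("humidity", {"ts": 1, "percent": 40.0}),
    ("sensor-2", {"ts": 2, "celsius": 21.0}),
]
merged = list(union(events, {"sensor-1", "sensor-2"}, "temperature", {"ts", "celsius"}))
```

Updates are forwarded as they arrive, which is why the operation needs no memory of previous updates.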


2.4.6 Group

Partitions are divisions of a stream, each with the same inner structure in its updates. Updates from the same stream will be placed on different partitions depending on some criteria. The group operation defines those criteria. One way to define them is the same as in the SQL Group By operation: placing in the same partition those updates that share the same value in a specific channel.

This operation requires an input stream, which will be partitioned by the user-defined criteria. These criteria might imply some computation to combine different channels in the update, but their cost is expected to be O(1). Updates are simply emitted from one partition to another; no memory storage or persistence is required.
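The group operation can be sketched as a routing function driven by user-defined criteria. In this example the criteria combine two hypothetical channels, `city` and `unit`, into the partition key. The partitions are collected in lists only so that the result can be inspected; a streaming implementation would emit each update to its partition immediately.

```python
from collections import defaultdict

def group(stream, criteria):
    """Route each update to the partition named by the user-defined criteria."""
    partitions = defaultdict(list)
    for update in stream:
        partitions[criteria(update)].append(update)
    return dict(partitions)

stream = [
    {"city": "Barcelona", "unit": "C", "value": 30.0},
    {"city": "Girona", "unit": "C", "value": 27.0},
    {"city": "Barcelona", "unit": "C", "value": 29.0},
]
by_city_and_unit = group(stream, lambda u: (u["city"], u["unit"]))
```

Computing the key takes constant time per update, in line with the expected O(1) cost of the criteria.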

2.4.7 Compose

Sometimes called the Zip operator, it is a function with a closed set of parameters, each parameter a different stream, that combines their last updates to produce a new stream. The input streams do not need to be equivalent, and in some cases the number of these inputs is limited to two. If more input streams need to be operated on, then more compose operations can be pipelined. It can be seen as the SQL Join operator, where rows from different tables are aggregated into a single new value/row. An example of using the compose operation would be to use a wind speed sensor stream and a temperature sensor stream to produce a wind chill stream.

This operation requires keeping the last update from each input stream in order to use them when a computation is triggered. With n being the number of input streams, it is O(n) memory-wise and it is expected to be O(n) time-wise, with a small n so that it does not become a bottleneck.
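The wind chill example can be sketched with a small compose operator that stores the last update of each input stream. The `Compose` class is hypothetical, and the formula used is the standard North American wind chill index (temperature in degrees Celsius, wind speed in km/h), which is an assumption of this example and is not taken from the text.

```python
class Compose:
    """Keep the last update of each input stream and combine them on trigger."""
    def __init__(self, inputs, fn):
        self.last = {name: None for name in inputs}
        self.fn = fn

    def push(self, stream, value):
        self.last[stream] = value
        if any(v is None for v in self.last.values()):
            return None                      # not every input has reported yet
        return self.fn(**self.last)

def wind_chill(temperature, wind_speed):
    """Wind chill index; temperature in Celsius, wind speed in km/h."""
    v = wind_speed ** 0.16
    return 13.12 + 0.6215 * temperature - 11.37 * v + 0.3965 * temperature * v

chill = Compose(["temperature", "wind_speed"], wind_chill)
first = chill.push("temperature", -10.0)    # None: wind speed still unknown
second = chill.push("wind_speed", 30.0)     # both inputs known: a value is emitted
```

The operator stores exactly one value per input stream, which matches the O(n) memory cost described above.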


Chapter 3

Dynamically Pipelined Processing for Composite Data Streams

3.1 Introduction

In recent years, Big Data and Internet of Things (IoT) platforms have been clearly converging in terms of technologies, problems and approaches. IoT ecosystems generate a vast amount of data that needs to be stored and processed, becoming a Big Data problem. Devices and sensors generate streams of data across a diversity of locations and protocols that in the end reach a central platform that is used to store and process it. Processing can be done in real time, with transformations and enrichment happening on-the-fly, but it can also happen after data is stored and organized in repositories.

This situation implies an increasing demand for advanced data stream management and processing platforms. Such platforms require support for multiple protocols for extended connectivity with the objects, but they also need to exhibit uniform internal data organization and advanced data processing capabilities to fulfill the demands of the applications and services that consume these streams of data.

To answer this growing demand, ServIoTicy¹ is a state-of-the-art platform for hosting real-time data stream workloads in the Cloud. It provides multi-tenant data stream processing capabilities, a REST API, data analytics, advanced queries and multi-protocol support in a combination of advanced data-centric services. The main focus of ServIoTicy is to provide a rich set of features to store and process data through its REST API, allowing objects, services and humans to access the information produced by the devices connected to the platform. ServIoTicy allows for real-time processing of device-generated data, and enables simple creation of data transformation pipelines using user-generated logic. Unlike traditional service composition approaches, usually focused on addressing the problems of functional composition of existing services, one of the goals of ServIoTicy is to focus on data processing scalability. Other components that can be connected to ServIoTicy provide added capabilities to automatically create compositions of high-level services using existing tools [84].

The core of the ServIoTicy runtime relies on a novel programming model that allows users to dynamically construct data stream processing topologies based on user-supplied code. These topologies are built on-the-fly according to a data subscription model defined by the applications that consume data. Once a stream subscriber finishes its work, it is freed from the platform until it is needed again. Each user-defined processing unit is called a Service Object (SO). Every Service Object consumes input data streams and may produce output streams that others can consume. Data streams can originate in real-world devices or they can be outputs of Service Objects deployed in the platform.

Advanced streaming and analytics platforms such as ServIoTicy are complex pieces of software that integrate a large set of components under the hood. They hide their complexity behind simple REST APIs and multi-protocol channels, but the reality is that their deployment and configuration are complex. ServIoTicy leverages the Apache STORM runtime for parallel data processing, which, combined with dynamic user-code injection, provides dynamic stream processing pipelining.

We provide insights on the performance properties of ServIoTicy as a starting point for the construction of advanced cloud provisioning strategies and algorithms. The work presented here focuses on the processing topologies built in ServIoTicy, although some details about other platform components are also provided.

Security is one of the main concerns in IoT platforms because they deal with large amounts of sensitive data. Although the applied security policies are not in the scope of this work, there have been efforts in that matter. Each update contains provenance data including the data owners and the operations that have been applied. The provenance data is used with a security policy manager to decide if an application can make use of the update.

The source code of ServIoTicy is freely available as an open source project² on GitHub. The platform is also available for single node testing as a Vagrant box, downloadable from a GitHub repository³.

The main contributions are:

• A technique for user-code injection on a data stream processing runtime that allows for dynamic creation and execution of stream processing topologies. This runtime is the core of the ServIoTicy platform.

• Detail on the operator that composes multiple streams into a single composite stream.

• An insight into the performance of the code-injection technique, including end-to-end response time in a processing pipeline and across stages.

The next sections of the chapter are organized as follows: Section 3.2 introduces the general architecture and components of the platform; Section 3.3 introduces a set of abstractions defined in ServIoTicy for managing data associated to objects; Section 3.4 describes in detail the stream processing runtime of ServIoTicy; Section 3.5 presents the evaluation methodology and the experiments; finally, Section 3.6 goes through the related work and Section 3.7 provides some conclusions and future lines of work.

3.2 Architecture of ServIoTicy

The Front-End of the platform is a Web Tier that implements the REST API that sits at the core of ServIoTicy. The API contains parts of the logic of the Service Objects and Data Processing Pipelines, related to authentication, data storage and data retrieval actions. The Stream Processing Topology is responsible for the execution of the code associated to Data Processing pipes as well as the forwarding of data across Service Objects and to external entities (e.g. external subscribers that want data forwarded in real time using a push model on top of MQTT or STOMP). Finally, the data Back-End includes the Data Store that provides scalable, distributed and fault-tolerant properties to ServIoTicy, and the Indexing Engine that provides search capabilities across sensor data using different criteria, like timestamps, string patterns or geo-location. In this section we describe in more detail the main properties of each component of the ServIoTicy architecture.

² https://github.com/servioticy

3.2.1 Web Tier

The Web Tier for the REST API is composed of a Servlet Container and a REST Engine. As an HTTP Web Server and Java Servlet container we use Jetty [62]. Jetty is often used for machine-to-machine communications, usually within larger software frameworks. As a JSON processor we use Jackson [61], which is a high-performance suite of data-processing tools for Java, including the flagship JSON parsing and generation library, as well as additional modules. The Jackson Project also has handlers to add data format support for JAX-RS implementations like Jersey.

3.2.2 Stream Processing Topology

The Stream Processing Topology is implemented on top of Apache STORM [96], which is a state-of-the-art stream processing runtime. Out of the box, STORM provides the ability to build topologies composed of spouts (sources of data) and bolts (processing units). Topologies are static after their deployment, and data keeps flowing through their bolts until the topology is stopped. STORM provides auto-scaling capabilities that make it particularly suitable for cloud deployments. Note that if a different topology is needed, the user has to stop the running topology and deploy the new one. This limitation does not affect the final platform, as will be explained in more detail in the following sections. The Stream Processing Topology also requires the support of a queuing system that acts as the spout for the STORM topology. In ServIoTicy, this is implemented using Kafka [98].
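The spout-and-bolt model described above can be illustrated with a minimal sketch in plain Python. It does not use the STORM or Kafka APIs: the names QueueSpout, StaticTopology, parse_bolt and threshold_bolt, as well as the message fields, are hypothetical and chosen only for illustration. The sketch shows the key property discussed in the text, namely that the chain of bolts is fixed once the topology is built.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Optional

# A bolt takes one tuple and returns a transformed tuple, or None to drop it.
Bolt = Callable[[dict], Optional[dict]]


class QueueSpout:
    """Hypothetical spout that drains a FIFO queue (a stand-in for the queuing system)."""

    def __init__(self, messages: Iterable[dict]) -> None:
        self._queue: Deque[dict] = deque(messages)

    def next_tuple(self) -> Optional[dict]:
        # Return the oldest pending message, or None when the queue is empty.
        return self._queue.popleft() if self._queue else None


class StaticTopology:
    """A fixed chain of bolts fed by one spout; the chain cannot change after construction."""

    def __init__(self, spout: QueueSpout, bolts: List[Bolt]) -> None:
        self._spout = spout
        self._bolts = tuple(bolts)  # frozen at "deployment" time

    def run(self) -> List[dict]:
        emitted = []
        while (item := self._spout.next_tuple()) is not None:
            for bolt in self._bolts:
                item = bolt(item)
                if item is None:  # the bolt dropped the tuple
                    break
            else:
                emitted.append(item)  # the tuple survived every bolt
        return emitted


def parse_bolt(msg: dict) -> dict:
    # Convert the raw string reading into a number.
    return {**msg, "value": float(msg["value"])}


def threshold_bolt(msg: dict) -> Optional[dict]:
    # Keep only readings at or above an arbitrary threshold.
    return msg if msg["value"] >= 20.0 else None


spout = QueueSpout([{"id": "a", "value": "21.5"}, {"id": "b", "value": "18.0"}])
topology = StaticTopology(spout, [parse_bolt, threshold_bolt])
emitted = topology.run()
```

In this sketch, changing the processing logic means building a new StaticTopology, which mirrors the stop-and-redeploy step that a static topology imposes.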


3.2.3 Data Store

A distributed data store is used to keep track of all the data produced by objects. For that purpose, Couchbase [42] has been chosen as the data store because it provides the benefits of NoSQL data stores (high distribution, high availability and scalability), and it is document oriented (which fits well with many different data sources and formats). Couchbase has native support for JSON documents. The definitions of all Service Objects in ServIoTicy and their associated streams are stored as JSON documents in Couchbase.
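As an illustration of what storing a definition as a JSON document means in practice, the following sketch serializes a Service Object with two streams and restores it. The field names (id, name, streams, channels, type, unit) and their values are assumptions made for this example and are not the actual ServIoTicy schema; no Couchbase client is used, only the standard json module.

```python
import json

# Hypothetical Service Object definition; every field name here is an assumption.
service_object = {
    "id": "so-001",
    "name": "weather-station",
    "streams": {
        "temperature": {"channels": {"celsius": {"type": "number", "unit": "degrees"}}},
        "humidity": {"channels": {"relative": {"type": "number", "unit": "percent"}}},
    },
}

# Serialize to the JSON text that a document store would keep under the object's key.
document = json.dumps(service_object, sort_keys=True)

# Restoring the document yields the same nested structure.
restored = json.loads(document)
stream_names = sorted(restored["streams"])
```

Because the whole definition is one self-describing document, objects with different sets of streams can coexist in the same store without a shared table schema.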

3.2.4 Data Indexing

The search infrastructure to resolve queries is provided by an underlying component that performs high-performance indexing and search operations. In particular, Elasticsearch [47] is leveraged, as it is one of the most powerful and widely used search engines that can be integrated with scalable data back-ends (in particular Couchbase). The integration between Couchbase and Elasticsearch enables full-text search, indexing, querying and real-time analytics for a variety of use cases, such as a content store or the aggregation of data from different data sources.
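The three kinds of search criteria mentioned for the Indexing Engine (timestamps, string patterns and geo-location) can be illustrated with a small in-memory filter. This is only a conceptual stand-in: it does not use the Elasticsearch query API, and the Update record, the search function and the bounding-box convention are all hypothetical names introduced for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Update:
    """Hypothetical sensor update with a timestamp, a text payload and a position."""
    timestamp: int
    text: str
    lat: float
    lon: float


def search(
    updates: List[Update],
    since: Optional[int] = None,
    until: Optional[int] = None,
    pattern: Optional[str] = None,
    bbox: Optional[Tuple[float, float, float, float]] = None,
) -> List[Update]:
    """Return the updates that satisfy every criterion that was supplied.

    bbox is (min_lat, min_lon, max_lat, max_lon); all bounds are inclusive.
    """
    regex = re.compile(pattern) if pattern is not None else None
    hits = []
    for u in updates:
        if since is not None and u.timestamp < since:
            continue
        if until is not None and u.timestamp > until:
            continue
        if regex is not None and not regex.search(u.text):
            continue
        if bbox is not None:
            min_lat, min_lon, max_lat, max_lon = bbox
            if not (min_lat <= u.lat <= max_lat and min_lon <= u.lon <= max_lon):
                continue
        hits.append(u)
    return hits


updates = [
    Update(100, "door open", 41.39, 2.11),
    Update(200, "door closed", 41.40, 2.17),
    Update(300, "window open", 48.85, 2.35),
]
recent = search(updates, since=150)
open_nearby = search(updates, pattern=r"open$", bbox=(41.0, 2.0, 42.0, 3.0))
```

A real search engine answers such queries from prebuilt indexes rather than by scanning every record, which is what makes it suitable at scale.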

3.2.5 Multi-Protocol Brokerage

In an attempt to make the ServIoTicy platform more accessible to devices, particularly those with less computing capacity or with tighter power constraints, the REST API is also reachable using other protocols and transports. In particular, STOMP over TCP and WebSockets, and MQTT over TCP are also available. All these features are implemented in ServIoTicy using a combination of newly developed bridges between components and Apache Apollo [12] as the core message brokering engine.

3.3 Abstractions used in ServIoTicy

Several abstractions are used in ServIoTicy to embrace the different entities involved in the existence of IoT ecosystems.
