ICT Cloud Computing, Internet of Services & Advanced Software Engineering, FP7-ICT-2011-8

“Open-Source, Web-Based, Framework for Integrating Applications with Social Media Services and Personal Cloudlets”

Deliverable D3.5 OPENi Cloudlet Framework Design Document

Workpackage: WP3 – Design

Authors: Dónal McCarthy (WIT), Eric Robson (WIT), Gary McManus (WIT), Dylan Conway (WIT), Johannes Hange (FOKUS)

Status: Final

Date: 23/09/2014

Version: 1.0

OPENi Project Profile

Contract No.: FP7-ICT-317883

Acronym: OPENi

Title: Open-Source, Web-Based, Framework for Integrating Applications with Social Media Services and Personal Cloudlets

URL: www.openi-ict.eu

Start Date: 01/10/2012

Duration: 30 months

Partners

Waterford Institute of Technology (Coordinator), Ireland

National Technical University of Athens (NTUA), Decision Support Systems Laboratory (DSSLab), Greece

Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V, Germany

INFORMATICA GESFOR SA (CGI), Spain

AMBIESENSE LTD, UK

VELTI SA, Greece

BETAPOND LIMITED, Ireland

Version | Date | Author (Partner) | Remarks

0.1 | 25/09/2014 | Dónal McCarthy (WIT), Johannes Hange (FOKUS), Dylan Conway (WIT) | First draft.

1.0 | 30/09/2013 | Dónal McCarthy (WIT), Johannes Hange (FOKUS), Dylan Conway (WIT), Gary McManus (WIT), Robert Kleinfeld (FOKUS) |

Executive Summary

As a research project, OPENi’s primary objective is to produce an innovative solution that integrates personal data storage and cloud-based services. To reach this goal we will develop a solution composed of two distinct, independent, but interrelated components: 1) an API Platform to enhance access to cloud-based services and 2) a Cloudlet framework to store users’ data. Applications can access cloudlet data independently of the API platform; likewise, applications can utilise the API platform without accessing Cloudlets.

There are many services available that offer personal data storage similar to OPENi’s Cloudlet. We profile and analyse a number of these services to ascertain the de facto industry standards with regard to data privacy, data control, and interoperability with 3rd party apps and services. Cognisant of existing solutions, in OPENi we allow users to store any type of data in any schema (or object type). This dynamic data approach makes the Platform more appealing to 3rd party developers; however, it complicates matters with regard to the key OPENi goal of seamless interoperability between applications. To address this complication the OPENi Cloudlet Framework will apply a schema to the data retrospectively through the use of folksonomies, with usage metrics dictating the global schemata in the OPENi registry.

The primary goal of OPENi is to give the user maximum control of their data. We address this through a number of technical solutions and privacy policies by:

• implementing non-intrusive logging,

• allowing users to purge their data,

• enabling users to realise the monetary value of their data,

• enabling them to port their data to other platforms,

• and giving them control over 3rd party access to their data through the use of intuitive GUIs.

We aim to give insight into OPENi’s research agenda by formulating a number of research questions which we will answer over the lifetime of the project. The questions cover a number of thematic areas such as: mobile application interoperability, meta-processing and data discovery, data monetisation, personal identity, and minimal exposure. Building on the research questions and use cases we identified the components required to create an OPENi compliant cloudlet platform. The list of requirements can be split into two sections: one deals with the management of the overall platform (monitoring, data aggregation, platform administration, provider GUI, and communications) and the other deals with the individual cloudlets (data access, management, authentication, notification, and Cloudlet GUIs).

The platform will be developed using agile software techniques, including SCRUM (an iterative software development process) and test-driven development.

Table of Contents

1 Preface – OPENi High Level Architecture Overview

1.1 Introduction

1.2 Security framework

1.3 API framework

1.4 Cloudlet framework

1.5 Mobile Client Library

1.6 Internal Interaction and Interoperability

1.7 Service Enablers

2 Introduction

2.1 Purpose and Objectives

2.2 Methodological Approach

2.3 Changes in Phase 2 D3.5

2.4 Document Structure

3 Personal Data Storage Market Analysis

3.1 Personal Data Storage Solutions

3.1.1 CAYOVA

3.1.2 FreedomBox

3.1.3 Gigya

3.1.4 Personal

3.1.5 Mydex

3.1.6 OwnCloud

3.1.7 Pidder

3.1.8 Privowny

3.1.9 Qiy

3.1.10 PAOGA

3.1.11 SocialSafe

3.1.12 openPDS

3.2 OPENi’s Position

3.3 Commercial Incentives for Deployment and Platform Bootstrapping

4 OPENi’s Personal Data Storage Research

4.1 Research Questions

4.2 Relevant Requirements and Use Cases

6 Cloudlet Framework Components

6.1 APIs

6.2 Data Storage

6.3 Management

6.3.1 Monitoring

6.3.2 Data Aggregation

6.3.3 Administration

6.3.4 Provider GUI

6.3.5 Communications

6.4 Cloudlet Management

6.4.1 Data access

6.4.2 Management

6.4.3 Authentication, Authorisation, and Accounting

6.4.4 Notification

6.4.5 Cloudlet GUIs

6.5 Component Interaction

6.6 Research to Component Mapping

6.7 Use Case to Component Mapping

8.1.5 Load Balancer

8.1.6 Cloud Platform

8.1.6.1 OpenStack

8.1.7 Orchestration

8.1.7.1 Chef

8.1.7.2 Ganglia

8.1.7.3 Nagios

8.1.8 Deployment

8.1.8.1 Docker

8.2 Cloudlet Framework Implementation

8.2.1 Web Server

8.2.2 API Endpoints

8.2.3 DAO

8.2.4 DataStore

8.2.5 Search

8.2.6 Logging & Auditing

8.2.7 Aggregator

8.2.8 Framework Management

8.3 Mobile Client

8.3.1 Native vs. Web Mobile Development

8.3.2 Mobile Client Libraries

8.3.2.1 jQuery Mobile

8.3.2.2 Titanium Mobile

8.3.2.3 Xui

9 APIs

9.1 Authentication and Authorization API

10.2 Alter or read cloudlet data

10.3 Application creation

10.4 User alter Cloudlet data automated

10.5 3rd party access data (aggregator)

11 Development Techniques

11.1 SCRUM

11.2 Test Driven Development

11.3 Vagrant

12 Conclusions

13 Acronyms

14 References

15 Appendix I

16 Appendix II: Data Storage

16.1.1 Relational Vs. NoSQL

16.1.2 NoSQL Datastores

16.1.2.1 Accumulo

16.1.2.2 Cassandra

16.1.2.3 CouchDB

16.1.2.4 Couchbase

16.1.2.5 HBase

16.1.2.6 MongoDB

16.1.3 Conclusion

17 Appendix III: Cloud Framework

17.1 Infrastructure as a Service (IaaS)

17.1.1 CloudSpaces

17.1.2 Eucalyptus

17.1.3 mOSAIC

List of Figures

Figure 1 OPENi Platform Architecture

Figure 6 Personal data storage services similar to OPENi’s

Figure 7 The Cloudlet Framework components and external entities

Figure 8 A visualization of the OPENi’s components interconnection

Figure 9 A single inheritance model with the ability to extend types

Figure 10 A single and multi-inheritance model

Figure 11 Structural relations between types

Figure 12 JSON-LD used for duck typing

Figure 13 A depiction of scopes

Figure 14 Cloudlet and registry interaction

Figure 15 Cloudlet and registry interaction

Figure 16 Cloudlet Framework components and the ZeroMQ topology

Figure 17 Subscribing to Third Party Application

Figure 18 Read/Update Cloudlet Data

Figure 19 Registering Third Party Applications

Figure 20 Automatic Cloudlet Data Update

Figure 21 Third Party Data Access

Figure 22 CAP theorem with real world examples

Figure 23 Separation of Responsibilities [45]

1 Preface – OPENi High Level Architecture Overview

1.1 Introduction

The OPENi platform is composed of four distinct but interrelated components: 1) a security framework that manages users' privacy by providing advanced authorisation and access control, 2) a Graph API Framework to enhance access to cloud-based services, 3) a Cloudlet framework to store users’ data, and 4) a mobile client library which enables developers to create OPENi-enabled mobile applications. While a number of these components can be used in isolation, they are all presented to mobile developers as a single platform through a unified client library and a single suite of REST endpoints. Within the platform, the individual frameworks also communicate with each other through REST endpoints.

1.2 Security framework

The security framework is responsible for the overall security of the OPENi platform. Its access control functionality affords users more control over their personal data and the cloud-based services with which they interact; it ensures that the user's privacy wishes are adhered to. As would be expected of a platform dealing with personal data, the security framework is tightly integrated with the other components of the platform. The client library has a number of permission dialogs that allow end users to interact with their privacy settings on the security framework. The cloudlet framework verifies all access requests for data within the cloudlets against the security framework; the graph API framework's functionality is verified in the same way.
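The access check described in this section can be illustrated with a minimal sketch. It is not the OPENi implementation: the class names `SecurityFramework` and `CloudletFramework`, their methods, and the (user, app, object type, action) permission tuple are all hypothetical, and the real frameworks communicate over REST rather than through in-process calls.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityFramework:
    """Hypothetical stand-in that records which app may do what with a user's data."""
    grants: set = field(default_factory=set)

    def grant(self, user, app, object_type, action):
        # Called when the user consents, e.g. through a permission dialog.
        self.grants.add((user, app, object_type, action))

    def is_allowed(self, user, app, object_type, action):
        return (user, app, object_type, action) in self.grants


@dataclass
class CloudletFramework:
    """Hypothetical cloudlet store that defers every access decision to the security framework."""
    security: SecurityFramework
    data: dict = field(default_factory=dict)

    def _check(self, user, app, object_type, action):
        if not self.security.is_allowed(user, app, object_type, action):
            raise PermissionError(f"{app} may not {action} {object_type} data of {user}")

    def write(self, user, app, object_type, obj):
        self._check(user, app, object_type, "write")
        self.data.setdefault((user, object_type), []).append(obj)

    def read(self, user, app, object_type):
        self._check(user, app, object_type, "read")
        return list(self.data.get((user, object_type), []))
```

In this sketch a write permission does not imply a read permission; whether OPENi separates the two in this way is an assumption made purely for illustration.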

1.3 API framework

The OPENi API framework is an open framework that is capable of interoperating with a variety of cloud-based services, abstracting the integration challenges to a single open standard without losing any service features. It is a single framework that will promote innovation by offering application developers an advanced framework that will enable them to design and build complex applications involving combinations of independent cloud-based services. Figure 2 depicts the OPENi API framework, the details of which are discussed in Deliverables D3.1 and D3.4.

1.4 Cloudlet framework

The OPENi cloudlet framework will provide application consumers with a single location to store and control their personal data. The cloudlet, in conjunction with the security framework, will empower application consumers to remain in control of their data. The control mechanisms will be inherently secure and trustworthy. As an open technology, validated by the open source community, consumers will be assured their data is not being used without their consent. The OPENi Cloudlet framework Standard defines a number of key components that make up the Cloudlet framework. These components are described in detail in deliverables D3.2 and D3.5.

also adds a significant number of enhancements including a number of security mechanisms and UI components.

1.6 Internal Interaction and Interoperability

The combination of the open API and cloudlet concepts makes OPENi both powerful and beneficial for consumers, application developers, and service providers. The vision for OPENi is to provide a platform that could be deployed and operated by many different application hosting or service providers looking to add value to their proposition. These ‘OPENi hosting providers’ will take advantage of various facets of the OPENi platform in ways that best suit their business model. We envisage that ordinary consumers will be largely unaware of the OPENi platform, with only the technology- and data-aware consumers cognisant of OPENi's existence. Consequently, a scenario may arise where consumers have many applications provided by many ‘OPENi hosting providers’. In practice we are planning for consumers to have a single cloudlet that can be connected to multiple API and security frameworks. To that end we designed an internal architecture where the three major on-platform feature-sets are decoupled into standalone frameworks (API, Security, and Cloudlet). Within the platform each framework exposes its services to the others through REST APIs and an accompanying client library.

Internal to the platform, the API framework persists data to the cloudlet framework through its REST endpoints, and is treated identically to an API framework installed elsewhere, i.e. once the appropriate security framework verifies that the user consents to the action, the interacting API framework can manipulate the Cloudlet data. In essence the API framework is treated like any other 3rd party application which can interact with a user's Cloudlet data. However, it differs in the following regards: 1) most of the time it happens to be located on the same server, 2) it acts on behalf of 3rd party applications, persisting data related to the applications' interaction with cloud-based services, and 3) while 3rd party app developers can implement Cloudlet data schemas of their own, the Graph API's schema will be the most widely used on the Cloudlet framework and will form the core of the Cloudlet's folksonomy-driven data model.
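To make the idea of a folksonomy-driven data model more concrete, the following sketch shows one way usage counts could decide which schema dominates for a given object type. It is an illustration under stated assumptions, not the OPENi registry design: the `TypeRegistry` class, its method names, and the rule "the most-used field set wins" are all hypothetical.

```python
from collections import Counter


class TypeRegistry:
    """Hypothetical registry that ranks schemas by how many stored objects use them."""

    def __init__(self):
        self.usage = Counter()

    def record_use(self, type_name, fields):
        # A schema is identified by its type name plus its sorted field names.
        self.usage[(type_name, tuple(sorted(fields)))] += 1

    def dominant_schema(self, type_name):
        """Return the most-used field list for a type name, or None if the type is unseen."""
        candidates = {key: count for key, count in self.usage.items() if key[0] == type_name}
        if not candidates:
            return None
        best = max(candidates, key=candidates.get)
        return list(best[1])
```

Under this assumption a widely used schema, such as the Graph API's, would naturally rise to the top of the registry without being imposed on 3rd party developers.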

1.7 Service Enablers

2 Introduction

2.1 Purpose and Objectives

Deliverable 3.5 will specify the OPENi cloudlet framework that will be utilised by users to create and deploy their cloudlets but also by developers to enable their applications to access stored user data. It outlines necessary components for secure storage of cloudlet data, regulating the operation of the entire cloud platform, and enabling user interaction (create, deploy, update, delete) and application communication. Additionally all readily available components and technologies will be identified for integration with the Cloudlet framework.

2.2 Methodological Approach

With the main objective of defining and specifying the OPENi cloudlet framework, the work carried out for the present report is based on the WP2 results and comprises the following steps:

Baseline Analysis

Identifying the existing state of the art through market analysis of OPENi competitors.

Evaluating suitable candidates from relevant technologies for the cloudlet framework in the context of its defined components.

Elaborating on research questions.

Preparatory Analysis

Ascertaining the essential concerns required to fulfil user requirements.

Discussing the implementation of the platform and exploring all choices.

Identifying actors, components and use cases.

Recognising key development methods facilitating quality of components.

Iterative Specifications

Classifying the communication and connections between components.

Defining the data storage mechanisms.

Defining the cloudlet management tools.

Specifying a suitable data model for persistence.

2.3 Changes in Phase 2 D3.5

Minor changes since the previous version of the document (D3.2) can be found throughout the document. To help the reader identify and focus on the content that has altered significantly, we list the major changes here:

• A new introduction to the high level architecture of the platform in Chapter 1.

• Section 3.1 has been altered to take into account changes within the services offered over the 12 months since the last market analysis was completed. Included are three additional services: PAOGA, SocialSafe, and openPDS.

• We added a section 3.3 that outlines how the cloudlet bootstrapping issue is tackled and also describes scenarios in which the whole OPENi platform can be deployed in a commercial sense.

• Chapter 5 has been added to further describe how the component parts integrate to represent a single platform.

• Section 6.5 and the Appendix have been updated to reflect changes in the technologies that we selected to implement specific components.

• Docker containers as a deployment scenario have been added in section 8.1.8.1.

• Subsection 8.2 describes how the Cloudlet framework components outlined in section 6 are implemented in relation to the technologies described in section 8.1.

2.4 Document Structure

3 Personal Data Storage Market Analysis

There are many services available that offer personal data storage similar to OPENi’s Cloudlet. In this section we profile and analyse a number of these services to ascertain the de facto industry standards with regard to data privacy, data control, and interoperability with 3rd party apps and services. We also gauge their accessibility and ease of use. The services that we analyse include: CAYOVA, FreedomBox, Gigya, Personal, Mydex, OwnCloud, Pidder, Privowny, Qiy, PAOGA, openPDS, and SocialSafe. Later we outline OPENi’s position with regard to each of these standards. Additionally we outline OPENi’s key innovations and discuss its key points of differentiation.

3.1 Personal Data Storage Solutions

The following are high level descriptions and analyses of Personal Data Storage solutions in general terms. Security and privacy are mentioned; however, for a more in-depth security analysis please see section 5.1 in D3.3.

3.1.1 CAYOVA

Cayova, which was detailed in the first version of this deliverable and was released as a beta service in April 2013, has since been discontinued. However, as a replacement there is a Chrome plugin available, called CAYOVA box. This plugin blocks advertiser networks from seeing what you are looking at online. It does this by preventing the advertiser networks from placing cookies onto your computer and therefore preventing them from seeing which websites you are visiting or which items you are looking at. It also informs you of which services have requested access to your browsing history, and from this information it can provide you with an overview of your online profile.

3.1.2 FreedomBox

3.1.3 Gigya

Gigya is a commercial entity, based in California, USA that provides business solutions for connected consumer management. Through their Connected Consumer Management Suite they offer business customers the ability to understand and connect more closely with today’s mobile and socially connected users, and therefore offer these users a more targeted service. Gigya specialises in gathering, transfer and storage of user information and through their products Gigya is then able to provide their clients with the rich data, intelligence and tools to reach consumers with relevant messages. These technologies help businesses to access, consolidate and manage the permission-based identity and behavioural data and in turn transform this data into targeted actions. Through this functionality Gigya has acquired a large list of multinational clients and partners.

While Gigya follows industry practice with regard to data security, such as firewalls, intrusion protection and encryption, they do state that data collected and stored is subject to the rules of their client and that they merely offer a collection and storage facility.

Since the last analysis Gigya has released a new tool, ‘Consumer Insights’, which aggregates the data from previous products and provides the customer with visual insights into user identities and their online behaviours. It allows marketers to query Gigya’s Identity Storage database and ties identity information to internal indicators to understand what information is driving various behaviours such as purchasing, commenting or sharing.

3.1.4 Personal

Personal is a web and mobile data vault that provides a private, cloud repository for users to store and share sensitive data, passwords and files, giving these users absolute, password protected control over their identities and data. The system is built on a privacy- and security- by-design platform and allows its users to leverage their data and to realize it to its maximum value. Third party developers only have access to the data that has been stored through their applications or has been specifically shared with them. The platform has a centralized approach for storing data in Rackspace, USA.

Since our last review of Personal they have released a new application on their platform called ‘Fill It’, which is a 1-click form filling app for logins, checkouts and various types of forms to enable automated and secure form-filling. It offers a library of nearly 140,000 indexed online forms that are auto-filled on the applicable website from the user's own pre-saved data. When filling out forms, Fill It enables the user to save their data, their passwords or even payment information (if they want) to their user data vault, and this information can then be used again in future forms that require the same information, sparing users from re-entering the same information again and again.

Personal is a proprietary initiative, and is offered to users as a monthly or annual subscription service.

3.1.5 Mydex

used across the system, with security access and control of the data store realized through a range of authentication options that vary from a simple username and password up to multifactor authentication methods that may even include biometrics. This means that the data store can only be accessed by the user and only they hold the key. It is similar to the Personal platform described above, but it has no interface for developers to develop third party applications; they can only create connectors to the platform in order to sync or retrieve information (e.g. profile information or a user’s address). There is therefore no flexibility for expanding the platform’s capabilities through applications, only through interfaces to the platform. Mydex explicitly states in a charter that there is no promise of data ownership, whereas Personal states that users do own their data. Mydex does, however, commit to not allowing third parties access to your data without consent.

Mydex is under a Creative Commons license and states that all the technologies it uses are open source; however, it does not make the software running on its cloud platform publicly available.

3.1.6 OwnCloud

OwnCloud is a cloud storage platform for business enterprise that provides businesses with their own file sync and share facilities. It offers businesses the opportunity to provide their own cloud storage service, on their own internal infrastructures, to employees with a larger degree of control than existing public cloud service providers. ownCloud is a completely open source solution and as such is transparent; it is also extensible, because it uses open standards that ease integration with a company’s existing services.

OwnCloud provides services similar to Dropbox, except that it affords greater control for organisations concerned with the confidentiality of their information. Due to the open source nature of the project, organisations can add, remove, and modify any features they want, and the organisations are responsible for setting their own security and privacy policies. Each business is also responsible for how the data is managed, with the businesses being the owners of the data.

ownCloud incorporates a file firewall, SAML/Shibboleth authentication, full logging (for auditing), and support for a number of SQL databases. All of these can extend ownCloud beyond the corporate DMZ and offer cloud-based file sync and share services to external or mobile users, which in effect allows those hosting an internal ownCloud solution to transform that solution into a true cloud services offering.

Since we introduced ownCloud in D3.2 the ownCloud company has released ‘ownCloud Documents’, which is a collaborative editing tool for rich-text documents. These documents can be created from within the tool or existing documents can be uploaded and used.

centralised privacy-by-design platform with all data stored in an encrypted form in relational databases on the Pidder servers. By default, access is granted only to the user, who then opens up their data according to their preferences.

Pidder locks itself out of the data by encrypting the data on the client side, and for additional privacy it does not log the IP addresses of those accessing the service. This client-side encryption helps prevent unauthorised access to data from within Pidder.

There have been no major updates to this service since we introduced it; however, the service has moved from a .com domain to a .de domain following revelations in the past that the USA is claiming jurisdiction over all .com domains, regardless of the server location.

3.1.8 Privowny

Privowny is a browser plug-in to control cookies and spam, manage your passwords and auto-login details with state of the art client-side encryption, and secure all data that you leave on websites. It is a way of capturing your digital footprint in your own PrivownySphere, which includes data submitted in web forms, preferences you set on websites and data from companies’ cookies, and it then allows you to manage this information and market it in a way that suits you.

Privowny is a commercial digital privacy company based in the US whose goal is to empower users by providing a digital memory of all information gathered about them. The system creates a record of all data given, knowingly or not, to third parties while browsing and allows users to discover companies that have shared their data through their Privowny account. Privowny is deployed on a centralised platform and Amazon's Web Services is used to store the digital footprints.

The crux of their privacy policy is that you own your data and third parties will not be given access to your data.

The data gathered by the browser plugin can be accessed from a built-in web interface, with all interactions with the web interface and plug-in utilising HTTPS. Future developments in Privowny include the development of a number of APIs that will allow read/write access to the data based on permissions provided by the user. APIs will also be developed to allow companies to push data they already have gathered about you into your PrivownySphere.

3.1.9 Qiy

Qiy (pronounced ‘key’) is a privately funded, non-profit foundation, whose stated goal is to return control of personal data to the individual. The user’s independence is fostered through a technical solution of strong encryption across all communications and data storage as well as a series of legal and policy implementations. The focus of Qiy is to aggregate a user's personal data and allow them greater control over who has access to it.

The Qiy privacy policy states that users own their data and commits to not sharing information with third parties without permission. Your data is encrypted, thus preventing Qiy staff from accessing it, and then stored across a number of certified datacenters in the Netherlands, which are protected by access controls and firewalls to prevent unauthorised access.

3.1.10 PAOGA

Since the last analysis, carried out for D3.2, the Personal Information Management Services sector has been accelerating, with new services being released on a regular basis. Therefore, we briefly introduce another player that is gaining a lot of traction in this space.

PAOGA (People Are Our Greatest Asset) is a UK company that has been developing a platform to provide individuals or organisations with a secure repository in which to enter personal information and data, documents and files, and to access, update, store and share them under their control for their own benefit. It provides an individual with their own secure digital safe deposit box, which means that they can impose their own rules as a condition of sharing their personal information with a company with whom they want to have a relationship. This system is now in the final stages of beta testing.

Once you have registered with PAOGA and downloaded their app you are provided with your PAOGA Personal Cloud as well as a unique certificate (key) which is stored on your computer and will be required for you to access this Personal Cloud as well as all the personal and private data, documents and files that you are protecting. PAOGA does not have a copy of your key, nor access to your cloud, and therefore has no access to your data.

Within PAOGA all of the private information, data, documents and messages are uniquely and strongly encrypted in ‘PAOGA Containers’ to the highest level automatically when stored, shared and transmitted. The user does not store their data with PAOGA, but chooses an alternative cloud storage platform such as Microsoft Azure, Amazon S3 or Dropbox, etc. and PAOGA provides individuals with an audit trail which includes when data changes were made. All of the individual’s information and documents are encrypted with the 'key' which is held by an independent Trusted Third Party and only accessible with the permission of the individual.

3.1.11 SocialSafe

SocialSafe, released in November 2010, aims to help users unite their Social Networks and safeguard their data from partial or total loss. SocialSafe allows its users to create a searchable offline journal populated by their online social media accounts. It is a program that resides on one or more devices, and directly accesses and downloads a user’s distributed cloud data to their own personal library on their device, or on their personal cloud (Dropbox, Google Drive, MS SkyDrive, NAS, etc). The user can select which aspects of each service they would like to back up from the cloud, and aggregate all of them into one place that can be browsed or searched. This library is 100% private and is an aggregated, cross-referenced library of all the user’s data.

3.1.12 openPDS

OpenPDS is a platform that is being developed at the MIT Media Lab through the Human Dynamics group. It provides users with a Personal Data Store (PDS) that gives them the ability to collect, store, and provide fine-grained access to various parts of their data while still protecting their privacy. It can be used to consolidate user data in a single location, which can be on their computer or in the cloud with a 3rd party service provider, and allows you to decide what is shared, with whom and where. OpenPDS deals specifically with metadata, which can be used to describe a person’s location, phone use or Web searches. OpenPDS sits between the application looking for data and your information: rather than being given direct access to all your raw data, the application must query your data store for what it is looking for. For example, a mobile app, Web site or research firm looking for information protected by an openPDS must query the data store directly (to check, for instance, whether your shipping address has changed or to confirm your present location). OpenPDS responds specifically to that query with answers that the openPDS owner approves for release. It also provides users with the ability to inspect their data at any stage, and monitor the usage of their data.
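The question-answering pattern described for openPDS can be sketched as follows. This is an illustration only and does not use or reproduce the openPDS API: the `PersonalDataStore` class, its `approve` and `ask` methods, and the question name are hypothetical.

```python
class PersonalDataStore:
    """Hypothetical store that answers owner-approved questions instead of exposing raw records."""

    def __init__(self, raw_data):
        self._raw = raw_data
        self._approved = {}   # question name -> function that computes the answer
        self.audit_log = []   # (app, question, released) entries the owner can inspect

    def approve(self, question, answer_fn):
        # The owner decides which questions may be answered at all.
        self._approved[question] = answer_fn

    def ask(self, app, question):
        released = question in self._approved
        self.audit_log.append((app, question, released))
        if not released:
            raise PermissionError(f"'{question}' is not approved for release")
        # Only the computed answer leaves the store, never the raw data.
        return self._approved[question](self._raw)


store = PersonalDataStore({"shipping_address": "2 New Road", "last_order_address": "1 Old Street"})
store.approve(
    "shipping_address_changed",
    lambda raw: raw["shipping_address"] != raw["last_order_address"],
)
```

Here an application learns only whether the address has changed, not the address itself, and every request, whether answered or refused, is recorded so that the owner can monitor how their data is used.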

3.2 OPENi’s Position

Figure 2: This diagram compares personal data storage services similar to OPENi’s under a number of categories. Based on our perception, each company is placed along a scale from least to most in each category. The diagram shows that OPENi is grouped with the companies that allow their users the most control, that it is the most interoperable, and that it has the most support for dynamic data. However, to lead in these areas some privacy features are sacrificed; consequently other services overtake OPENi in this regard.

The services that we analysed took varied approaches to the type of data that they allow users to store on their system. Some are restrictive and only store data from a single domain or data in predefined schemas/structures. Examples are FreedomBox and Pidder, which are primarily concerned with communications and social networking data; CAYOVA and Privowny, which concentrate on web browsing data; and OwnCloud, which concentrates on enterprise data like contacts, calendar and documents. Other services give their users more control over the type and structure of data that is stored. Included in this group are Mydex and Personal, which allow their users to define custom data structures, and Gigya, which integrates with many 3rd party services.


directly integrating with mobile applications. However, OPENi is the most accessible to 3rd parties as it allows users to share their data across applications and cloud-based services. This presents a wealth of data that application developers can tap into to enhance their services. In addition, OPENi offers an aggregation feature that allows users to monetise their data in a privacy preserving way (see sub-section 6.3.2) and gives 3rd parties access to information composed of data from many cloudlets.

The primary goal of OPENi is to give users maximum control of their data. We address this through a number of technical solutions and privacy policies. Similar to Personal and Pidder, OPENi will implement non-intrusive logging. It will allow a user to purge their data from the system, as do CAYOVA, Personal, Pidder, and FreedomBox. Similar to CAYOVA, it will allow users to realise the monetary value of their data by rewarding them for sharing it with 3rd parties. Uniquely, it will allow users to port their data to other platforms and give them control over 3rd party access to their data through the use of intuitive GUIs.

All the services that we analysed heavily emphasised their security and privacy features. FreedomBox’s view is that centralised systems are a privacy concern as the platform owners can access all user data. To counter this perceived risk they implemented a distributed system where each user installs their software on a personal server in their home. FreedomBox was unique in this view; the rest opted for a traditional centralised platform, although their attitudes to data protection and privacy differed. Privowny, Pidder, Personal, and Mydex utilise client side encryption to encrypt data before it is sent to their platform; consequently only the user can decrypt their data. In OPENi we cannot take this approach for all data as it would restrict mobile application interoperability. However, we do understand that there will be instances where users will want to protect some data, so we will allow users to encrypt their data on a per data-point basis. As a result this data will not be intelligible to the platform, rendering it unusable by the application interoperability and data aggregation features.

To further address users’ privacy concerns we will implement non-intrusive logging and provide a detailed privacy policy. This approach is similar to CAYOVA’s. In OPENi we will implement the Cloudlet Framework on a centralised cloud based platform; however, we took care to select lightweight technologies so that more tech-savvy and privacy-concerned users can install the platform on a personal server.

3.3 Commercial Incentives for Deployment and Platform Bootstrapping


a Backend as a Service (BaaS) for the enterprise’s suite of mobile applications; 2) a similar setup where a software house deploys an instance of OPENi as a BaaS for its many clients; and 3) a single web scale deployment that any app developer or end user can sign up to.

Regardless of the deployment scenario there are two user groups, the developers (either in-house or freelance) and the end users (either enterprise or public), to whom the platform must deliver consistent value across the disparate environments.

For the developers OPENi provides a scalable user centric datastore, a group of advanced APIs, and a cross-platform client library. By opting for OPENi the app developer can outsource many of the complexities around data transport, storage, and regulations and concentrate right away on their application’s logic. The OPENi platform is therefore expected to reduce development time.

For the end users the OPENi framework provides a location for them to safely store their data and secure mechanisms that afford them full control over how they share their data.

As with any platform that relies on data to provide certain services, bootstrapping is an issue for a number of the platform’s features. The initial platform strives to find a balance between features that can be used out of the box and features that make intelligent use of end users’ data. For example, the interaction with the Graph API is highly valuable from the initial deployment, while some of the features in the cloudlet framework become more prevalent as users add more ‘OPENi enabled’ apps and increasingly interact with their cloudlet. As the cloudlet is populated with more and more data through extended use, it becomes easier for both the end user and app developers to reuse that data. The aggregator feature will only be useful once the platform contains enough cloudlets and cloudlet data that valuable generic results can be provided that mask the specifics of individual users’ digital identities.


4 OPENi’s Personal Data Storage Research

In this section we aim to give insight into OPENi’s research agenda by formulating a number of research questions which we will answer over the lifetime of the project. The questions cover a number of thematic areas such as: mobile application interoperability, meta-processing and data discovery, data monetisation, personal identity, and minimal exposure; the questions are closely aligned with the research agendas outlined in D3.4 and D3.6. Later in this section we identify the requirements and use cases that will help answer these research questions.

4.1 Research Questions

The key overall research question for the OPENi Cloudlet Framework is as follows:

How should a scalable, extensible, secure Cloudlet Framework be developed in order to provide the ability to store users’ data for mobile Apps, social media add-ons, and enterprise level applications?

In order to address the overall research question (RQ) we need to carefully investigate various aspects of storing users’ data: user data unification and monetization, personal user space instantiation on the cloud, digital user-identity formation. The specific research questions that address these issues are as follows:

1. How should an open source Cloudlet Framework enable the instantiation of user spaces in the cloud, with capabilities such as storage, discoverability, addressability, access, and security of users’ data across applications and devices?

2. How should potential differences in data representation by the 3rd party applications be negotiated in order to facilitate data re-use and interoperability?

3. How should the Cloudlet Framework present data to enable convenient meta-processing (both indexing and searching) to facilitate the user in monetising their data in a privacy preserving way?

4. How should the Cloudlet Framework for each individual user encompass and manage their data (e.g. health, finance, legal data) in order to build their personal identity?

5. How should the Cloudlets as a user centric data store further the currently observed state of the art HTTP based data access to promote privacy and enable a minimal exposure concept?

4.2 Relevant Requirements and Use Cases

Each of the research questions can be linked with the use cases of the OPENi project in a manner that reinforces the concepts outlined in both the use cases and the research questions.

Research question one has distinct links with the scenarios of the MyLife and Personalised in-store shopping use cases. These use cases require a system that enables users to sign up to the service in an easy manner and that creates these accounts with all the accompanying configurations and features (storage, security, etc.).


system. The MyHealth scenario from MyLife has both users and medical specialists accessing and editing the same data. By utilising personalised advertising 3rd parties will use the OPENi platform to supply users with targeted ads through analysis of their accessible data. The personalised in-store shopping use case will see retailers supply OPENi with details about their products, stock and offers; it will also provide recommendations on products to users. As the research question states, all the data from these different sources need to be represented in such a way that re-use and interoperability are supported within the OPENi system.

The third research question focuses solely on the Personalised advertising use case of OPENi, which will allow 3rd parties to utilise user data to create targeted ad campaigns. In keeping with the data protection and privacy ethos of OPENi the shared data will be anonymised. Users must opt in to the advertising programs to allow 3rd parties access to their data. The MyLife use case helps answer the fourth research question. It shows scenarios where users keep details about their health, transactions, stocks, and more on the OPENi platform. This information allows users to build and expand their personal identity with the OPENi platform.

Similar to the second research question, the fifth question is addressed by all three use cases. The OPENi platform will reduce the unwanted exposure of user data to all three services.

Relation to OPENi Use Cases

| Research Question | MyLife Use Case | Personalised Advertising Use Case | Personalised In-store Shopping Use Case |
|---|---|---|---|
| Q1 | High | Low | High |
| Q2 | High | High | High |
| Q3 | Low | High | Low |
| Q4 | High | Low | Low |


5 OPENi Platform Integration

In this section we outline how the three distinct frameworks that come together to make the OPENi Platform are presented to external actors as a single cloud based application.

Figure 3: Mongrel2 and Swagger Combiner linking the three frameworks

The three individual frameworks expose their functionality through Swagger-defined REST endpoints. Swagger allows developers to describe their REST actions through a defined JSON structure. Because the Swagger JSON is generic, a number of other tools have been built around it, providing interactive documentation, client SDK generation, and discoverability.
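As a rough illustration of how the Swagger descriptions of the three frameworks might be presented as one listing, the sketch below merges three minimal Swagger-style documents. The endpoint paths, summaries, and the `combine` helper are hypothetical and are not taken from the OPENi code base; the top-level layout (`swagger`, `info`, `paths`) loosely follows Swagger 2.0.

```python
import json


def combine(title, *docs):
    """Merge the `paths` of several Swagger-style documents into one."""
    combined = {
        "swagger": "2.0",
        "info": {"title": title, "version": "1.0"},
        "paths": {},
    }
    for doc in docs:
        for path, operations in doc["paths"].items():
            # Two frameworks must not claim the same endpoint.
            if path in combined["paths"]:
                raise ValueError("duplicate path: " + path)
            combined["paths"][path] = operations
    return combined


# Hypothetical endpoint descriptions, one per framework.
cloudlet_doc = {"paths": {"/cloudlets": {"get": {"summary": "List cloudlets"}}}}
api_doc = {"paths": {"/photos": {"get": {"summary": "List photos"}}}}
security_doc = {"paths": {"/tokens": {"post": {"summary": "Issue a token"}}}}

platform_doc = combine("OPENi Platform", cloudlet_doc, api_doc, security_doc)
print(json.dumps(sorted(platform_doc["paths"])))
```

Rejecting duplicate paths is one possible design choice; a combiner could equally prefix each framework's paths with a namespace.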


request to the API framework, which runs on Python Django, via one port (8889), and it HTTP-proxies requests to the security framework’s Tomcat instance through a different port (8887). As the cloudlet framework is a distributed application, its application code is spread over a number of workers running on a multitude of ports. These workers consume and produce ZMQ messages; Mongrel2 converts incoming HTTP calls into specially formatted ZMQ messages and pushes them onto the ZMQ queue. It does the reverse with the ZMQ messages it receives back from the workers.
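To make the message conversion more concrete, the following sketch parses a request in the wire format that the Mongrel2 manual documents for handlers (`UUID ID PATH SIZE:HEADERS,SIZE:BODY,`, with JSON headers) and builds a reply in the corresponding response format (`UUID SIZE:ID, BODY`). It is a simplified illustration only: no ZMQ socket is opened, the sample sender id and path are made up, and the function names are our own rather than those of the OPENi workers.

```python
import json


def read_netstring(data):
    """Split one netstring 'SIZE:PAYLOAD,' off the front of data."""
    length, _, rest = data.partition(b":")
    size = int(length)
    if rest[size:size + 1] != b",":
        raise ValueError("netstring payload must be followed by a comma")
    return rest[:size], rest[size + 1:]


def parse_request(message):
    """Parse 'UUID ID PATH SIZE:HEADERS,SIZE:BODY,' into a dict."""
    sender, conn_id, path, rest = message.split(b" ", 3)
    headers_raw, rest = read_netstring(rest)
    body, _ = read_netstring(rest)
    return {
        "sender": sender.decode(),
        "conn_id": conn_id.decode(),
        "path": path.decode(),
        "headers": json.loads(headers_raw),
        "body": body,
    }


def build_response(sender, conn_id, http_payload):
    """Build 'UUID SIZE:ID, PAYLOAD' addressed to a single connection."""
    ident = conn_id.encode()
    prefix = sender.encode() + b" " + str(len(ident)).encode() + b":" + ident
    return prefix + b", " + http_payload


# Hypothetical sample message as a worker might receive it.
headers = json.dumps({"METHOD": "GET", "PATH": "/objects"}).encode()
sample = b"54c6755b 7 /objects " + str(len(headers)).encode() + b":" + headers + b",0:,"

request = parse_request(sample)
reply = build_response(request["sender"], request["conn_id"],
                       b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
print(request["path"], reply[:14])
```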

As described above, the Platform appears to external entities as a single entity at a low level, but at the application level it is still three distinct frameworks. However, because each framework is accessed through stateless, standalone REST endpoints, this fact is masked. At a high level, interoperability is attained through the use of the security framework’s JSON Web Tokens (JWT) and interaction through each framework’s endpoints. As described in D3.6, all security related interaction is performed through the Security framework.

One of the features of the security framework is that it issues tokens to mobile applications, which include user defined access permissions for that app. These tokens also act as the glue that binds the security framework to the Cloudlet and API frameworks. All apps and third party services have to include these tokens as a header in all calls to the Cloudlet and API framework endpoints. The Cloudlet and API frameworks then verify the tokens, which are signed using public-key cryptography, and enforce the user defined access rules defined within them. Another benefit of these signed tokens is that applications and 3rd party services can programmatically process them without altering them. They can then alter their application’s logic based on the permissions, e.g. request access to additional data, disable features that require data for which the app was denied permission, or enable features that require certain data.
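The application-side use of a token described above can be sketched as follows. The example only reads the claims of a JWT (three base64url segments separated by dots) and does not verify the signature, which remains the job of the Cloudlet and API frameworks. The claim name `permissions`, the permission strings, and the helper names are assumptions made for illustration; the real claim layout is defined by the Security framework.

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def read_claims(token):
    """Return the claims of a JWT without modifying or verifying it."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


def enabled_features(token, required):
    """Map each app feature to True if all permissions it needs were granted."""
    granted = set(read_claims(token).get("permissions", []))
    return {feature: needs <= granted for feature, needs in required.items()}


def demo_token(claims):
    """Build an UNSIGNED stand-in token, for illustration only."""
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([encode({"alg": "none"}), encode(claims), ""])


token = demo_token({"permissions": ["read:photos"]})
features = enabled_features(token, {
    "gallery": {"read:photos"},   # granted, so the feature stays enabled
    "upload": {"write:photos"},   # not granted, so the app can disable it
})
print(features)
```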


6 Cloudlet Framework Components

This section describes the components required to create an OPENi compliant cloudlet framework. It is broken into two parts: one deals with the management of the overall framework and the other deals with the individual cloudlets. For each component we define what it does and what other components it interacts with.

Figure 4 The Cloudlet Framework components and external entities

6.1 APIs

The APIs will provide the medium for inter-component communication in the Cloudlet and also for external communication with the API framework and with Apps. More details of the Cloudlet API can be found in section 8.

6.2 Data Storage


The data storage component of the cloudlet framework must be capable of accommodating binary files as well as text data.

6.3 Management

The platform providers will be responsible for managing the underlying resources, which serve the cloudlet framework. To enable the management of these resources the following components are crucial to the platform.

6.3.1 Monitoring

Automated monitoring of the cloudlet framework will offer providers the ability to pre-empt certain potential problems and efficiently react to many issues within the platform.

In conjunction with standard infrastructure metrics, logs of platform application actions such as creating cloudlets, inserting data and querying cloudlet data stores will be aggregated and analysed by the monitoring component to provide the platform provider with comprehensive information of the platform as a whole.

Alerts can be configured to notify the platform provider when certain criteria are met, e.g. available disk space less than 87%, CPU utilization greater than 98%, etc.
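A minimal sketch of such threshold-based alerting is given below. The metric names and the rule format are hypothetical; the thresholds simply reuse the example values from the text.

```python
import operator

# (metric name, comparison, threshold); names are illustrative only.
RULES = [
    ("disk_free_percent", operator.lt, 87),
    ("cpu_percent", operator.gt, 98),
]


def triggered_alerts(metrics, rules=RULES):
    """Return one message per rule whose criterion is met by the metrics."""
    alerts = []
    for name, compare, threshold in rules:
        value = metrics.get(name)
        if value is not None and compare(value, threshold):
            alerts.append(f"{name}={value} breaches threshold {threshold}")
    return alerts


print(triggered_alerts({"disk_free_percent": 50, "cpu_percent": 99}))
```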

6.3.2 Data Aggregation

The data aggregation (DA) component will offer 3rd parties the ability to view aggregated user data from multiple cloudlets while concealing the individual cloudlet owners’ identities. A 3rd party will send a request to the Cloudlet Framework for aggregated data. The DA will negotiate with the authorisation component to identify cloudlets that wish to share data with the 3rd party in a privacy preserving way. It then requests the data from each cloudlet, aggregates the data, and sends the results to the 3rd party. The security access levels required to access users’ cloudlets are outlined in Deliverable 3.3.


Each platform will have a separate, independent data aggregation component. This component does not aim to integrate with the corresponding component on other OPENi Cloudlet Platforms.
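The aggregation flow of this subsection can be sketched as follows. This is an illustration under stated assumptions, not the OPENi implementation: cloudlets are modelled as plain dictionaries, the `shared_with` field stands in for the negotiation with the authorisation component, and the `min_cloudlets` cut-off is our own assumption about how results from too few cloudlets might be withheld so that individuals cannot be singled out.

```python
from statistics import mean


def aggregate(cloudlets, third_party, field, min_cloudlets=3):
    """Aggregate one field over cloudlets that opted in for this 3rd party.

    Only the aggregate is returned; cloudlet ids never leave the function.
    """
    values = [
        c["data"][field]
        for c in cloudlets
        if third_party in c["shared_with"] and field in c["data"]
    ]
    if len(values) < min_cloudlets:
        return None  # too few contributors to mask individual owners
    return {"field": field, "count": len(values), "mean": mean(values)}


# Hypothetical cloudlets; only the first three share data with "adco".
cloudlets = [
    {"id": "c1", "shared_with": {"adco"}, "data": {"weight": 70}},
    {"id": "c2", "shared_with": {"adco"}, "data": {"weight": 80}},
    {"id": "c3", "shared_with": {"adco"}, "data": {"weight": 90}},
    {"id": "c4", "shared_with": set(), "data": {"weight": 150}},
]

print(aggregate(cloudlets, "adco", "weight"))
```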

6.3.3 Administration

Platform providers require the ability to initially set, and later adjust, the resources, control and communication settings of the framework in order to maintain a high quality and efficient platform for cloudlets. The administrative tasks include:

• Create various types of nodes such as database master node, database slave node, application node, component node etc.

• Add/Remove nodes on the platform

• Connect to a specific node in order to troubleshoot issue(s) reported by a cloudlet owner

• Global platform access control. Revoke the access tokens of a cloudlet owner in the event of an account being compromised.

6.3.4 Provider GUI

The provider GUI will serve as an interface for platform providers to view data from the monitoring component and carry out administrative tasks on the cloudlet framework, such as:

• Manage the platform’s data, e.g. log entries, load balancer metrics, and users

• Change notification settings from the monitoring component

• Carry out the administration tasks defined in subsection 6.3.3

6.3.5 Communications

This component is responsible for communicating with the platform’s users. It will incorporate an email and an SMS service. Email communication is required to notify users of registration progress and platform updates. Two-way SMS communication is utilised to verify that registering users are not automated machines.

Users can also combine the communications and notification component to create alerts for cloudlet data mutations e.g. they can get an alert each time their weight is changed by an application.

6.4 Cloudlet Management

The components outlined in this section will facilitate the management of the individual cloudlets on an OPENi platform.

6.4.1 Data access


data access is outlined in more detail in D3.3 and the Cloudlet API is described in more detail in section 8 of this document.

6.4.2 Management

In an OPENi platform, cloudlet owners are promised full control of their cloudlets. Together with the cloudlet GUI component, the management component provides the individual cloudlet owner with high-level control of their cloudlets. Some common management operations a cloudlet owner can perform are:

• Creating and deleting their cloudlet

• Porting their data to a cloudlet framework on another platform

• Suspending 3rd parties’ access to their cloudlet data

6.4.3 Authentication, Authorisation, and Accounting

Authentication and authorisation mechanisms are now handled by the security framework; however, accounting (or auditing) is still handled in the cloudlet framework. The details of all access requests, subsequent actions, and cloudlet responses will be monitored and logged by the accounting component. These logs will be available in the cloudlet GUI for the cloudlet owner to inspect.

6.4.4 Notification

Various components and external services can sign up for notifications of events on a user’s cloudlet.
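One simple way to picture this sign-up mechanism is a publish/subscribe registry keyed by cloudlet and event type. The class, event name, and payload below are hypothetical and only illustrate the idea; they do not describe the notification component's actual interface.

```python
from collections import defaultdict


class Notifier:
    """Tiny in-memory publish/subscribe registry for cloudlet events."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, cloudlet_id, event, callback):
        """Register a callback for one event type on one cloudlet."""
        self._subscribers[(cloudlet_id, event)].append(callback)

    def publish(self, cloudlet_id, event, payload):
        """Deliver the payload to every subscriber; return how many were notified."""
        callbacks = self._subscribers.get((cloudlet_id, event), [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


received = []
notifier = Notifier()
notifier.subscribe("cloudlet-1", "object.updated", received.append)
notified = notifier.publish("cloudlet-1", "object.updated", {"weight": 82})
print(notified, received)
```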

6.4.5 Cloudlet GUIs

To empower Cloudlet owners in the management of their cloudlets they will be provided with GUIs. Some of the functions that will be available in this component include:

• Viewing access logs

• Editing preferences


6.5 Component Interaction

Figure 5 A visualization of OPENi’s component interconnection: the green components are GUIs, the red are external concerns, and the components with grey borders do not expose an external API. Storage is the central component. It is not exposed through an external API but rather through the authorization and authentication APIs. Most components alter different parts of the storage to some degree. The webserver is needed to serve the GUIs and permission dialogs.

6.6 Research to Component Mapping

The first research question focuses on user spaces within the cloudlet. It outlines important concerns such as instantiation, storage, access, and data security. The instantiation and storage will require the use of the user management, storage, notification, and data components. The data monitoring component will ensure the smooth running of the system from both a security and a data integrity point of view. The authentication and authorisation components will be utilised for ensuring that all access to a user’s data adheres to their data access rules. The users themselves will access the system using the user GUI.


how foreseeable data should be formatted then both the reuse and interoperability concerns can be limited.

The third of the research questions involves processing formatted data to facilitate monetizing in a secure manner. This will require the use of the authentication, storage, notification, and data components to allow for access in order to process the data. The users will be required to use the permissions dialog component to allow for the processing and monetizing of their data. The processing of the data will be accomplished using the data aggregator component.

The fourth research question focuses on allowing the user to manage their data on the cloudlet and facilitating the creation of their personal identity. To manage their data the users will need access to it, requiring the authentication, storage, notification, and data components. The users will manage their data using the user management, user GUI, and permissions dialog components. The permissions dialog component allows the user to be selective about which elements of their personal identity can be seen by others, while the user management and GUI components enable the user to access their data and personal identity.

The final research question focuses on the security and privacy of the cloudlet and how it is accessed. The cloudlets can be secured by the authentication, authorisation, and blacklist components so that the only resources accessible to a user are their own.

6.7 Use Case to Component Mapping

Several of the components are integral to all the OPENi use cases. Each of the use cases requires some form of access to the data storage and notification components. Many of the key principles of OPENi focus on security and user privacy; therefore the authentication component is used for all use cases that require access to both user and system data.

MyLife - The MyLife use case is a broad use case which entails the storage and management of data from the users’ everyday life, including photo, health, financial, and messaging data. As previously stated, this use case requires the use of the system’s data storage, and each user will be required to authenticate with the system in order to gain access to it. As this use case requires users to sign up, add, and edit their user details, the user management and user GUI components will be utilised. As this use case focuses on bringing many different aspects of everyday life together, each potentially having its own data format, it is important for mobile application interoperability that the registry identifies schemas in the data. Users will be required to set permissions on their MyLife resources, using the permissions dialog component, to allow others to view or modify them.


7 Data model

In OPENi we allow users to store any type of data in their Cloudlet. This dynamic data approach makes Cloudlets more appealing to 3rd party developers; however, it makes it difficult to achieve seamless interoperability between applications, which is one of OPENi’s key goals. To address this interoperability difficulty the OPENi Registry will apply a schema to the data retrospectively through the use of folksonomies.

The data model defines the system’s capabilities to interact with and manipulate the objects and schemata. The Data API defines the manner in which the user can interact with the objects while the Type API defines the interaction with the schemata (or object types).

As the cloudlet is a user centric data storage concept, developers are able to use it to store data within the user’s domain of influence. In order to support all possible use cases a developer is able to imagine, the cloudlet must be able to store all possible objects a developer can define. Any object the developer would not be able to store as part of the cloudlet, would have to be stored externally and therefore outside the user’s domain of influence.

In OPENi, data is created by developers at will and at any time. This means developers are able to create data with the object type of their choosing. A key research goal of OPENi is the interoperability between applications and the ability to discover types and data of other applications. We will offer a type model under which developers may define their own types and discover those of others.

A common approach to achieve interoperability is standardization. A standard is created by parties which share a common field of interest and goals, but may differ in their approach to achieve them. The development of standards may take a long time and in the end may be too narrow [11] or not specific enough to provide a useful interoperability [12]. For these reasons, standardization is not a feasible concept for the OPENi type model. OPENi must be able to support interoperability with a more dynamic approach such as folksonomies.

Both dynamic data and interoperability are crucial to OPENi. Both have to be supported to enable the implementation of any developer use case while supporting seamless interoperability between applications. While researching class based type models we came to the conclusion that they may not allow a type ecosystem to grow dynamically. In the following subsections we discuss these approaches and their shortcomings.

7.1 Inheritance


Figure 6 A single inheritance model with the ability to extend types. Extensions may add new properties to the base type. The corresponding objects can also be cast to smaller types. This is achieved by walking the inheritance chain (e.g. objectT+1 -> Type+1 -> Type0, therefore the object can be cast to T0). This concept is called subtyping and it enables OPENi to identify all objects which can be returned. A query for T0 can return object0, 1 and 2 as they all share the properties defined by T0.

The main motivation behind an inheritance model is twofold. Firstly, it creates a way to relate types semantically to each other. Secondly, it provides a syntactic contract: any object must provide the properties which are specified in its type. Likewise the inheritance from a parent to its child type guarantees that the properties of the parent are also present in any object of the child type. This allows code to be written to these contractual expectations. The validity of the object to its type guarantees a certain amount of interoperability by providing knowledge about the underlying data to the developer. It functions like a self-defined standard and shares some of the pitfalls. First, it only allows a static inheritance chain. This means a type does not change its ancestry during its lifetime. The parent is a fixed reference in the child class and likewise in the object. Secondly, once a property is defined, it cannot be undefined in a subtype. In order to get rid of the property, a type will need to fork the inheritance chain before the point at which the property was introduced. Thirdly, it only allows for chains of inheritance. Therefore, single inheritance is unable to express the duality of an amphibious vehicle by deriving it from the types boat and car. Multiple-inheritance allows a class (or object) to inherit from multiple parents (see Figure 7). However, multiple-inheritance creates a more complex ancestry graph, introduces naming conflicts, and can violate the Liskov substitution principle. Many languages therefore avoid it.
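The subtyping query described for the single inheritance model can be sketched as below. The type names T0 to T2 and the object names mirror Figure 6; the `PARENT` table and the helper functions are illustrative assumptions rather than part of the OPENi design.

```python
# Each type points at its single parent; T0 is the root of the chain.
PARENT = {"T0": None, "T1": "T0", "T2": "T1"}

# Each stored object references exactly one type.
OBJECTS = {"object0": "T0", "object1": "T1", "object2": "T2"}


def ancestry(type_name):
    """Walk the inheritance chain from a type up to the root."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = PARENT[type_name]
    return chain


def query(objects, requested):
    """Return every object that can be cast to the requested type."""
    return [name for name, t in objects.items() if requested in ancestry(t)]


print(query(OBJECTS, "T0"))
print(query(OBJECTS, "T2"))
```

Because the parent reference is fixed, an object of T2 can never be returned for a type that is not already in its chain, which is the static-chain limitation noted above.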


Figure 7 A single- and multi-inheritance model. Boat1 is a Boat and it is also a vehicle, but it cannot be a car, as a car does not share semantics (other than being a vehicle) with a boat. Boat2 is both a boat and a car by having a type (amphibian) which inherits from both classes. This is not possible in most programming languages due to the possible name clashes of properties (and the so-called diamond problem).

7.2 Interfaces


Figure 8 The graphic visualizes the structural relations types can have to each other. A type can overlap with another type. These two types share common property names and may share a common property type. A type can be contained within another one. This means all of the properties of one type can be found in the other. They can also be equal, meaning they share all properties of each other. Lastly they can be distinct from each other, meaning they do not share properties with each other.
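The four structural relations in Figure 8 can be expressed with simple set operations on property names. In this sketch a type is modelled as a dictionary from property name to property type, and only the names are compared; both choices, and the example types, are assumptions made for illustration.

```python
def relation(type_a, type_b):
    """Classify how the property names of two types relate to each other."""
    names_a, names_b = set(type_a), set(type_b)
    if names_a == names_b:
        return "equal"
    if names_a < names_b or names_b < names_a:
        return "contained"
    if names_a & names_b:
        return "overlapping"
    return "distinct"


person = {"name": "string", "gender": "string"}
employee = {"name": "string", "gender": "string", "salary": "number"}
pet = {"name": "string", "species": "string"}
address = {"street": "string", "city": "string"}

print(relation(person, employee))  # person's properties are all in employee
print(relation(person, pet))       # they share only "name"
print(relation(person, address))   # no shared property names
```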

7.3 Multiple types

Both interfaces and classes provide a contractual description for objects. OPENi’s object types are closely related to interfaces as they do not carry implementations (as classes do). When combining multiple interfaces (or types), the potential for naming conflicts exists. Figure 8 visualizes the different constellations types may have to each other when being combined. With the exception of the distinct case, all combinations have overlapping properties. These properties are not problematic if they share a common type. However, if their types differ, the question arises as to which property should be used. An example related to Figure 7: Boat as well as Car may have a color property. If Boat.color and Car.color have different types, which color declaration is to be used in Amphibious? This is outlined in Example 2. A common solution to this problem is the ability to select or rename attributes explicitly or to accept a declaration by convention (e.g. the rightmost type declaration is to be used). It is important to notice that multiple types do not necessarily imply multiple-inheritance. If a type inherits from multiple types, this type can again be a parent to another type. This can lead to complex inheritance graphs. On the other hand, objects with multiple types may not suffer from complex inheritance graphs, as the inheritance model of these types itself may still be singular.
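The resolution by convention mentioned above, where the rightmost type declaration wins, might look as follows. The property types (`string`, `rgb`, and so on) and the `combine_types` helper are hypothetical; the sketch also reports which properties were in conflict so that a developer could instead rename or select them explicitly.

```python
def combine_types(*types):
    """Combine types left to right; on a clash the rightmost declaration wins."""
    combined, conflicts = {}, []
    for declared in types:
        for name, prop_type in declared.items():
            if name in combined and combined[name] != prop_type:
                conflicts.append(name)
            combined[name] = prop_type
    return combined, conflicts


boat = {"color": "string", "draft": "number"}
car = {"color": "rgb", "wheels": "integer"}

amphibious, conflicts = combine_types(boat, car)
print(amphibious["color"], conflicts)
```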

7.4 Duck typing


duck typing is a feature of dynamic languages, it does not suffer from naming conflicts. If a property is re-defined, it simply carries a different type from this point forward. This follows an analogous ideology to convention over configuration. Code may be receiving “faulty” objects at any time and the system will not assure how an object looks. This is a very dynamic model that would fit the goal of a dynamic type system for the OPENi cloudlet. However, it does not provide any facility for interoperability and is therefore unsuitable for an open type system. Developers could constantly be in conflict as to how an object should look. Figure 9 visualizes how duck typed objects can be perceived in a typed environment.

Figure 9 On the left, a multiple type model. On the right, a linked data model like JSON-LD, which can be used for duck typing. On the right side, the types of attributes are declared in the object itself. The model does not support a class inheritance model, but requires type knowledge to be defined in the object. Objects can be very expressive in the right model; unfortunately they also convey a lot of duplicate information and no clear structure. JSON-LD does allow for the “linking” of other types as part of a class-like structure, but does not define any enforcement or interoperability of these types (seen on the rightmost side). A typical single inheritance model is omitted but can be deduced from the left side.
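The consequence of duck typing for application code can be illustrated with a small sketch: nothing checks an object against a type when it is stored, so a consumer only discovers a missing property when it tries to use it. The property names and the `shipping_label` function are made up for this example.

```python
def shipping_label(obj):
    """Use whatever object is passed in, as long as it has the needed properties."""
    try:
        return f'{obj["name"]}, {obj["street"]}, {obj["city"]}'
    except KeyError as missing:
        # With duck typing the mismatch only surfaces at the point of use.
        raise ValueError(f"object lacks expected property {missing}") from None


complete = {"name": "Ann", "street": "1 Main St", "city": "Waterford"}
print(shipping_label(complete))

partial = {"name": "Bob"}  # accepted by the store, but unusable here
try:
    shipping_label(partial)
except ValueError as error:
    print(error)
```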


derivative (child) type which would break the substitution model. Likewise shadowing can also apply to an object. Shadowing is depicted in Figure 10.

Figure 10 A depiction of scopes. A scope may have properties such as name and gender, which may be redefined (“shadowed”) in its child scopes. Code sees the variables of the current scope and those of the parent scope, provided they have not been shadowed. This can be used in a type and object model as well.
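Shadowing as depicted in Figure 10 can be mimicked with Python's `collections.ChainMap`, which looks a key up in the child mapping first and falls back to the parent. Treating scopes as property-to-type mappings, and the concrete property types used here, are assumptions made for the sake of the example.

```python
from collections import ChainMap

# Parent scope defines both properties.
parent = {"name": "string", "gender": "string"}

# Child scope redefines ("shadows") gender and inherits name.
child = ChainMap({"gender": "enum"}, parent)

print(child["name"])     # resolved in the parent scope
print(child["gender"])   # resolved in the child scope
print(parent["gender"])  # the parent definition is left untouched
```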

7.6 OPENi

7.6.1 Types

Singular inheritance is a model which has proven suitable to programming. In programming languages, inheritance serves as a contractual assurance that enables subtyping and casting. It enables the system to return objects of a larger type even when a smaller type is requested, as seen in Figure 6. However, a static inheritance model does not allow the developer to reduce an existing type freely and to get rid of unwanted properties. In order to find all compatible objects the subtype must exist in the inheritance chain. We will investigate whether explicit inheritance can be useful to the system. A typical inheritance model demands that the developer knows the type hierarchy a priori or refactors it later. Both seem unlikely for OPENi, as multiple developers will use a type and will likely disagree on later changes or a priori structures. Nobody, not even the initial developer, should be able to alter a provided schema as long as other developers have applications based on objects of this type. Otherwise other developers would be unable to follow the changes and their applications could break. On the other hand, the introduction of compatible subtypes into an inheritance chain is possible. It would require the system to rewire the subtype’s parent relation and would create a new subtype.

automatically linked together by the system. After types are linked on a semantic level, the syntax the developer has provided can be used to build an implicit and changing inheritance model. The dynamic approach allows OPENi to continuously accept new types of any form and weave them into each other, semantically and syntactically, without the developers’ expressed intent. This allows the system to rewrite its inheritance model and to perform housekeeping on the type system.
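One way an implicit inheritance model could be derived from syntax alone is sketched below. It rests on a simplifying assumption that is not stated in this document: a type is treated as an implicit subtype of another whenever its property set strictly contains the other's. The type ids and the `implicit_parents` function are hypothetical.

```python
def implicit_parents(types):
    """Map each type id to the ids whose property sets it strictly contains."""
    parents = {}
    for type_id, props in types.items():
        parents[type_id] = sorted(
            other_id
            for other_id, other_props in types.items()
            if other_id != type_id and other_props < props  # strict subset
        )
    return parents

# Hypothetical types, described only by their property names.
types = {
    "named": frozenset({"name"}),
    "person": frozenset({"name", "gender"}),
    "employee": frozenset({"name", "gender", "employer"}),
}
before = implicit_parents(types)

# A newly submitted type is woven in without anyone declaring a parent.
types["addressed"] = frozenset({"name", "address"})
after = implicit_parents(types)
```

Because the relations are recomputed from the stored property sets, adding a type never requires an existing schema to be edited; only the derived hierarchy changes.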

An interesting challenge will occur when different developers add a type with the same intention, and therefore the same name, but with a different structure or different attributes in mind. A type typically consists of a human-readable id and an address to reference it. For an open type system it is not feasible to let developers name the schemata they provide: each name could only be used once, and it would be unclear whether the community agrees with the naming. Instead, the developer suggests a name, description and tags, which are used as metadata to guide the automated system. The type therefore contains semantic links and a syntactic structure. Additionally, the developer is able to provide further hints to the system, such as tags and type names. This presents a challenge, as developers have to find, identify and use types defined by others. Even when developers agree on the syntactic structure of a type, they can provide additional semantic hints to the system.

An example schema follows. The corresponding object is shown in 6.2. The OPENi type system allows the use of multiple types which are referenced by the object. As such, the object supports type composition without multiple inheritance and can be composed of different so-called traits. In this example, a person consists of personal information such as the name, but is enriched by address and birth information. These three schemata can also be combined into a single one if the developer feels this is a common approach. As such, the type system prefers composition over inheritance. This is but one approach, and we will further investigate other type and schema approaches during development. Appendix I shows another.

A type is expressed via a URI (or IRI) and therefore uniquely identified. The objects and schemata are stored and indexed. An index over the types and identifiers of the objects and the schemata is necessary to provide adequate lookup speeds for the most common use cases.
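Before turning to the schema itself, the index mentioned above can be sketched as an inverted index from type identifier to object ids. This is a minimal in-memory illustration under stated assumptions, not the OPENi storage design: the `TypeIndex` class, its method names, and the `type/person` and `type/address` identifiers are all hypothetical placeholders.

```python
from collections import defaultdict

class TypeIndex:
    """In-memory index from type IRI to the ids of objects referencing it."""

    def __init__(self):
        self._objects = {}
        self._by_type = defaultdict(set)

    def put(self, object_id, type_iris, data):
        """Store an object and register it under every type it references."""
        self._objects[object_id] = data
        for iri in type_iris:
            self._by_type[iri].add(object_id)

    def find(self, *type_iris):
        """Return ids of objects that reference all of the given types."""
        if not type_iris:
            return set()
        id_sets = [self._by_type.get(iri, set()) for iri in type_iris]
        return set.intersection(*id_sets)

index = TypeIndex()
# An object composed of two traits, and one that uses a single trait.
index.put("o1", ["type/person", "type/address"], {"name": "Ada", "city": "Waterford"})
index.put("o2", ["type/person"], {"name": "Grace"})

all_people = index.find("type/person")
people_with_address = index.find("type/person", "type/address")
```

Since an object may reference several types, a lookup by one type returns every object that includes that trait, while a lookup by several types narrows the result to objects composed of all of them.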

type/afb5e73f7079d9ce805381a380bbf7e5” {

“@context”:{