Domain driven design, NoSQL and multi-model databases

Max Neunhöffer

Java Meetup New York, 10 November 2014

(2)

Max Neunhöffer

I am a mathematician

“Earlier life”: research in Computer Algebra (Computational Group Theory)

Always juggled with big data

Now: working in database development, NoSQL, ArangoDB

I like:

research, hacking, teaching,

tickling the highest performance out of computer systems.

(3)

A typical Project: a Web Shop

The Specification Workshop

(need recommendation engine, need statistics, etc.)

The Developers get to work

. . .

(tables, relations, normalisation, schemas, queries, front-ends, etc.)

HANDOVER

(Why can I not . . . ? This is unusable!)

(4)

Solution: Agile Approach and Domain Driven Design

These days, many use (or try to use):

agile methods (Scrum, sprints, rapid prototyping)

with continuous feedback from product owners to developers,

promising fewer surprises in deployment and high flexibility.

Domain Driven Design (Eric Evans, 2004):

identify a Domain (the area in which the software is applied),

make a Model (an abstract description of the situation),

use a Ubiquitous Language (that all team members speak),

clearly define the Context in which the model applies.

Model your data as close to the domain as possible.

Example: object oriented programming

(5)

Fundamental Problem: need a ubiquitous Language

Listening to team members, you hear completely different things:

Product Managers talk about

customers “browsing” through the shop,

powerful search for products (with the “good ones” up),

“useful” recommendations.

Developers talk about

tables, normalisation, queries and joins,

secondary indexes, front-end pages,

object oriented, model view controller, responsive design.

⇒ both groups think the others are morons

(6)

The problem is rooted very deeply

functionality not gathered methodically

“obvious” functions are missing

no common language

misunderstandings about details

(7)

NoSQL: Richer Data Models are closer to the Domain

Some terms used by Evans as part of the ubiquitous language:

Entity: has an identity and mutable state (e.g. a person)

Value object: is identified by its attributes and immutable (e.g. an address)

Aggregate: is a combination of entities and value objects into one transactional unit (e.g. a customer with its orders)

Association: is a relation between entities and value objects, can have attributes, usually immutable

Consequences

These terms coming from the Domain must be present in the Design. The whole team must understand the same thing when talking about them.
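The four terms can be carried into code almost literally. The following is a minimal sketch in Python, not taken from the talk: the class names `Address`, `OrderLine` and `Customer` and all their fields are hypothetical, chosen to mirror the examples on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Address:
    """Value object: identified by its attributes, immutable."""
    street: str
    city: str


@dataclass(frozen=True)
class OrderLine:
    """Kept as an immutable value object here for brevity (an assumption)."""
    product: str
    quantity: int


@dataclass(eq=False)
class Customer:
    """Entity and aggregate root: identity is customer_id, state is mutable."""
    customer_id: str
    name: str
    address: Address
    orders: List[OrderLine] = field(default_factory=list)

    def __eq__(self, other):
        # Entities compare by identity, not by their current attributes.
        return isinstance(other, Customer) and self.customer_id == other.customer_id

    def __hash__(self):
        return hash(self.customer_id)

    def place_order(self, product, quantity):
        # The aggregate changes only through its root, as one unit.
        self.orders.append(OrderLine(product, quantity))
```

Two `Customer` objects with the same `customer_id` compare equal even if name or address differ, while two `Address` objects are equal exactly when all their attributes match.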

(8)

Polyglot Persistence

Idea

Use the right data model for each part of a system.

For an application, persist

an object or structured data as a JSON document,

a hash table in a key/value store,

relations between objects in a graph database,

a homogeneous array in a relational DBMS.

If the table has many empty cells or inhomogeneous rows, use a column-based database.

Take scalability needs into account!

(9)

Document and Key/Value Stores

Document store

A document store stores sets of documents, which usually means JSON data; these sets are called collections. The database has access to the contents of the documents.

each document in the collection has a unique key

secondary indexes possible, leading to more powerful queries

different documents in the same collection: structure can vary

no schema is required for a collection

database normalisation can be relaxed

Key/value store

Opaque values, only key lookup without secondary indexes:

⇒ high performance and perfect scalability
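The difference between the two models can be sketched with two toy in-memory classes. This is an illustration only, not how any real product is implemented; `KeyValueStore`, `Collection` and their methods are hypothetical names, and the index assumes hashable attribute values.

```python
from collections import defaultdict


class KeyValueStore:
    """Opaque values: the store only supports lookup by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class Collection:
    """Toy document collection: unique keys plus optional secondary indexes."""

    def __init__(self):
        self._docs = {}
        self._indexes = {}  # attribute name -> {value -> set of keys}

    def ensure_index(self, attribute):
        # Build the index over all documents that have the attribute.
        index = defaultdict(set)
        for key, doc in self._docs.items():
            if attribute in doc:
                index[doc[attribute]].add(key)
        self._indexes[attribute] = index

    def insert(self, key, doc):
        if key in self._docs:
            raise KeyError(f"duplicate key: {key}")
        self._docs[key] = doc
        for attribute, index in self._indexes.items():
            if attribute in doc:
                index[doc[attribute]].add(key)

    def find(self, attribute, value):
        # Use the secondary index if there is one, else scan all documents.
        if attribute in self._indexes:
            keys = self._indexes[attribute].get(value, set())
            return [self._docs[k] for k in sorted(keys)]
        return [d for d in self._docs.values() if d.get(attribute) == value]
```

The collection can look inside its documents, so it can index and query by attribute even when documents differ in structure. The key/value store never inspects the value, which is what keeps it simple to scale.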

(10)

Graph Databases

Graph database

A graph database stores a labelled graph. Vertices and edges are documents. Graphs are good to model relations.

graphs often describe data very naturally (e.g. the Facebook friendship graph)

graphs can be stored using tables; however, graph queries notoriously lead to expensive joins

there are interesting and useful graph algorithms like “shortest path” or “neighbourhood”

need a good query language to reap the benefits

horizontal scalability is troublesome

graph databases vary widely in scope and usage, no standard
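To make “neighbourhood” and “shortest path” concrete, here is a small sketch that treats edges as plain documents and runs a breadth-first search over them. The attribute names `_from` and `_to`, the function names, and the choice to treat edges as undirected are assumptions made for this example.

```python
from collections import deque


def neighbours(edges, vertex):
    """Return the vertices directly connected to `vertex` (undirected)."""
    result = set()
    for edge in edges:
        if edge["_from"] == vertex:
            result.add(edge["_to"])
        elif edge["_to"] == vertex:
            result.add(edge["_from"])
    return result


def shortest_path(edges, start, goal):
    """Breadth-first search; returns a list of vertices, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for n in sorted(neighbours(edges, current)):
            if n not in previous:
                previous[n] = current
                queue.append(n)
    return None


friendships = [
    {"_from": "alice", "_to": "bob"},
    {"_from": "bob", "_to": "carol"},
    {"_from": "carol", "_to": "dave"},
    {"_from": "alice", "_to": "eve"},
]
print(shortest_path(friendships, "alice", "dave"))
# prints ['alice', 'bob', 'carol', 'dave']
```

Each call to `neighbours` scans the whole edge list, which corresponds to one join in a table-based representation. A path of length k needs k such steps, which is why graph queries over tables become expensive.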

(11)

A typical Use Case — an Online Shop

We need to hold

customer data: usually homogeneous, but still variations

⇒ use a document store

product data: even for a specialised business quite inhomogeneous

⇒ use a document store

shopping carts: need very fast lookup by session key

⇒ use a key/value store

order and sales data: relate customers and products

⇒ use a document store

recommendation engine data: links between different entities

⇒ use a graph database

(12)

Polyglot Persistence is nice, but . . .

Consequence: One needs multiple database systems in the persistence layer of a single project!

Polyglot persistence introduces some friction through

data synchronisation,

data conversion,

increased installation and administration effort,

more training needs.

Wouldn’t it be nice . . .

. . . to enjoy the benefits without the disadvantages?

(13)

The Multi-Model Approach

Multi-model database

A multi-model database combines a document store with a graph database and a key/value store.

Vertices are documents in a vertex collection, edges are documents in an edge collection.

a single, common query language for all three data models

is able to compete with specialised products on their turf

allows for polyglot persistence using a single database

queries can mix the different data models

can replace an RDBMS in many cases
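A query that mixes the models might look like the following in-memory sketch: product documents are looked up by key, while the recommendation itself follows edges. All collection contents and the function `also_bought` are invented for this illustration; a real multi-model database would express this in its query language rather than in application code.

```python
# Vertex collection: plain documents addressed by key.
products = {
    "p1": {"title": "Laptop"},
    "p2": {"title": "Mouse"},
    "p3": {"title": "Keyboard"},
}

# Edge collection: each edge is itself a document linking two vertices.
bought = [
    {"_from": "customers/c1", "_to": "products/p1"},
    {"_from": "customers/c1", "_to": "products/p2"},
    {"_from": "customers/c2", "_to": "products/p1"},
    {"_from": "customers/c2", "_to": "products/p2"},
    {"_from": "customers/c2", "_to": "products/p3"},
    {"_from": "customers/c3", "_to": "products/p3"},
]


def also_bought(product_key):
    """Titles of products bought by customers who bought `product_key`."""
    target = f"products/{product_key}"
    # Graph step 1: walk edges backwards to find the buyers.
    buyers = {e["_from"] for e in bought if e["_to"] == target}
    # Graph step 2: walk forwards to everything else those buyers bought.
    counts = {}
    for e in bought:
        if e["_from"] in buyers and e["_to"] != target:
            counts[e["_to"]] = counts.get(e["_to"], 0) + 1
    ranked = sorted(counts, key=lambda p: (-counts[p], p))
    # Document step: key lookup to fetch the product documents.
    return [products[p.split("/", 1)[1]]["title"] for p in ranked]


print(also_bought("p1"))
# prints ['Mouse', 'Keyboard']
```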

(14)

A Map of the NoSQL Landscape

[Figure: map of the NoSQL landscape. Labels: Map/reduce, Column Stores, Extensibility, Documents, Massively distributed, Graphs, Structured Data, Key/Value, Operational DBs, Analytic DBs, Complex queries.]

(15)

ArangoDB

is a multi-model database (document store & graph database),

is open source and free (Apache 2 license),

offers convenient queries (via HTTP/REST and AQL), including joins between different collections,

offers strong consistency guarantees using transactions,

is memory efficient by shape detection,

uses JavaScript throughout (Google’s V8 built into server),

has an API extensible by JavaScript code in the Foxx framework,

offers many drivers for a wide range of languages,

is easy to use with web front end and good documentation, and

enjoys good community as well as professional support.

(17)

The ArangoDB Territory

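[Figure: the map of the NoSQL landscape, shown under the title “The ArangoDB Territory”. Labels: Map/reduce, Column Stores, Extensibility, Documents, Massively distributed, Graphs, Structured Data, Key/Value, Operational DBs, Analytic DBs, Complex queries.]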

(18)

Strong Consistency

ArangoDB offers

atomic and isolated CRUD operations for single documents,

transactions spanning multiple documents and multiple collections,

snapshot semantics for complex queries,

very secure durable storage using append-only writes and storing multiple revisions,

all this for documents as well as for graphs.

In the (near) future, ArangoDB will

offer the same ACID semantics even with sharding,

implement complete MVCC semantics to allow for lock-free concurrent transactions.
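The link between append-only storage, multiple revisions and snapshot semantics can be shown with a toy log. This is only a sketch of the general idea and makes no claim about ArangoDB's actual storage engine; `AppendOnlyStore` and its methods are hypothetical.

```python
class AppendOnlyStore:
    """Toy store: writes are only ever appended, old revisions stay readable."""

    def __init__(self):
        self._log = []  # entries are (revision, key, document)

    def write(self, key, doc):
        # Never overwrite in place: append a new revision instead.
        revision = len(self._log) + 1
        self._log.append((revision, key, dict(doc)))
        return revision

    def current_revision(self):
        return len(self._log)

    def read(self, key, at_revision=None):
        """Latest document for `key` as of `at_revision` (default: now)."""
        limit = len(self._log) if at_revision is None else at_revision
        for _, k, doc in reversed(self._log[:limit]):
            if k == key:
                return dict(doc)
        return None
```

A long-running query can remember `current_revision()` when it starts and pass it to every `read`. Writes that arrive later are appended behind that point, so the query keeps seeing a consistent snapshot.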

(19)

Replication and Sharding — horizontal scalability

Right now, ArangoDB provides

easy setup of (asynchronous) replication, which allows read access parallelisation (master/slaves setup),

sharding with automatic data distribution to multiple servers.

Very soon, ArangoDB will feature

fault tolerance by automatic failover and synchronous replication in cluster mode,

zero administration by a self-repairing and self-balancing cluster architecture.
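One common way to distribute data automatically is to hash the document key and take the result modulo the number of shards. The slide does not say which scheme ArangoDB uses, so the sketch below is an assumption that only illustrates the general technique; `shard_for` and `distribute` are hypothetical names.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a document key to a shard number in a stable, deterministic way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards):
    """Group keys by the shard they would be stored on."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards
```

Because the shard depends only on the key, any server can compute where a document lives without consulting a central directory. The drawback of plain modulo hashing is that changing the number of shards moves most keys.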

(20)

Powerful query language: AQL

The built-in Arango Query Language (AQL) allows complex, powerful and convenient queries,

with transaction semantics,

allowing joins,

with user-definable functions (in JavaScript).

AQL is independent of the driver used and offers protection against injections by design.

For Version 2.3, we are reengineering the AQL query engine:

use a C++ implementation for high performance,

optimise distributed queries in the cluster.
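Injection protection typically rests on bind parameters: the query text contains placeholders such as `@name`, and the values travel separately. The sketch below only builds a request body and sends nothing. The collection `customers` and the helper `build_query_request` are hypothetical, and the field names `query` and `bindVars` follow my understanding of ArangoDB's HTTP cursor API, so check them against the documentation for your version.

```python
import json


def build_query_request(query, bind_vars):
    """Return a JSON body that keeps user values out of the query text."""
    return json.dumps({"query": query, "bindVars": bind_vars})


# The query text is a constant; user input never becomes part of it.
QUERY = "FOR c IN customers FILTER c.name == @name RETURN c"

# Even hostile input stays an ordinary string value.
user_input = 'x" || true || "'
body = build_query_request(QUERY, {"name": user_input})
```

With string concatenation, the same input could change the meaning of the filter. With a bind parameter it is compared as a literal name and simply matches nothing.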
