How graph databases started the multi-model revolution
Luca Garulli
Author and CEO @OrientDB
“90% of the data in the world today has been created in the last two years alone.”
- IBM
Just Data
Order #134 (Order), Luca (Provider), Commodore Amiga 1200 (Product), Jill (Customer), Monitor 40” (Product), Mouse (Product), Bruno (Provider)
Data by itself has little value, it’s the relationship between data that gives it
Top NoSQL categories
Key/Value Databases
Document Databases
Why do most NoSQL products avoid
Joins Are Evil
Customer
ID  Name
10  John
11  John
24  Mike
28  Mike

CustomerAddress
ID  Address
10  24
10  33
32  44

Address
ID  Location
24  Milan
33  London
18  Paris
18  Madrid
44  Moscow
[Index tree: A-Z splits into A-L and M-Z]
Imagine an Address Book where we want to find Luca’s phone number
[Index tree: A-Z splits into A-L and M-Z; A-L into A-D and E-L; M-Z into M-R and S-Z]
Index algorithms are all similar and based on balanced trees
[Index tree, level by level: A-Z; A-L, M-Z; A-D, E-L, M-R, S-Z; A-B, C-D, E-G, H-L; E-F, G, H-J, K-L]
Index Lookup: how does it work?
[Index tree, level by level: A-Z; A-L, M-Z; A-D, E-L, M-R, S-Z; A-B, C-D, E-G, H-L; E-F, G, H-J, K-L. Luca: Found!]
This lookup took 5 steps.
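The lookup above can be sketched as a binary search over a sorted list of keys, which halves the candidate range at every step much like descending one level of a balanced tree. This is only an illustration under stated assumptions: the `lookup` function and the generated keys are hypothetical, and real database indexes use B-trees or similar structures rather than a plain sorted list.

```python
import math


def lookup(sorted_keys, target):
    """Binary search that also reports how many comparisons it needed."""
    lo, hi = 0, len(sorted_keys) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            return mid, steps
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


# Hypothetical address book: one million zero-padded keys, already in sorted order.
keys = [f"name{i:07d}" for i in range(1_000_000)]
position, steps = lookup(keys, "name0777777")

# Worst case for binary search over N keys is floor(log2(N)) + 1 comparisons.
max_steps = math.floor(math.log2(len(keys))) + 1
print(position, steps, max_steps)
```

The step count grows with the logarithm of the number of keys, so every extra lookup on a large index pays that cost again.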
With millions of indexed
Joins Kill Performance
Joins are executed every time you cross relationships
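A rough sketch of what crossing those relationships costs, using the rows from the Customer, CustomerAddress and Address tables shown earlier. The `Index` class and `customer_locations` function are hypothetical, a sorted list with binary search stands in for a real index, and reading the CustomerAddress columns as (customer ID, address ID) is an assumption.

```python
from bisect import bisect_left, bisect_right


class Index:
    """Toy index: rows sorted by their first column, searched with bisect."""

    def __init__(self, rows):
        self.rows = sorted(rows, key=lambda row: row[0])
        self.keys = [row[0] for row in self.rows]
        self.lookups = 0  # counts how often the index is searched

    def find(self, key):
        self.lookups += 1
        start = bisect_left(self.keys, key)
        end = bisect_right(self.keys, key)
        return self.rows[start:end]


customer_idx = Index([(10, "John"), (11, "John"), (24, "Mike"), (28, "Mike")])
customer_address_idx = Index([(10, 24), (10, 33), (32, 44)])
address_idx = Index([(24, "Milan"), (33, "London"), (18, "Paris"),
                     (18, "Madrid"), (44, "Moscow")])


def customer_locations(customer_id):
    """Join Customer -> CustomerAddress -> Address for one customer ID."""
    results = []
    for _, name in customer_idx.find(customer_id):
        for _, address_id in customer_address_idx.find(customer_id):
            for _, location in address_idx.find(address_id):
                results.append((name, location))
    return results


print(customer_locations(10))
total = customer_idx.lookups + customer_address_idx.lookups + address_idx.lookups
print("index searches for one query:", total)
```

Nothing is remembered between calls: running the same query again repeats every index search.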
Querying millions of records joining 3-4 tables could
This is why query performance suffers as the database increases in size
In a world that’s becoming more connected, we need a better way to store data and manage relationships
“A graph database is any storage system that provides index-free adjacency”
- Marko Rodriguez
Every developer knows the Relational Model, but who knows the
Back to school: Basic Graph
Vertices and Edges can have properties
Edges are directed
* https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
Property Graph Model*
Sao Paulo
people: 12,000,000
Luca
company: OrientTechnologies
Vertices and Edges can have properties
Luca --Visited (on: 2015)--> Sao Paulo
An Edge connects only 2 vertices
Use multiple edges to represent 1-N and N-M relationships
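The property graph rules above can be sketched in a few lines. This is a minimal illustration, not any product's API: the `Vertex`, `Edge` and `connect` names are hypothetical, and the second city (London) is made up to show a 1-N relationship expressed as multiple edges.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality avoids recursing through edge cycles
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)
    in_edges: list = field(default_factory=list)


@dataclass(eq=False)
class Edge:
    label: str
    source: Vertex  # an edge is directed: source -> target
    target: Vertex  # and it connects exactly two vertices
    properties: dict = field(default_factory=dict)


def connect(source, label, target, **properties):
    """Create one directed edge and register it on both of its vertices."""
    edge = Edge(label, source, target, properties)
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge


luca = Vertex("Luca", {"company": "OrientTechnologies"})
sao_paulo = Vertex("Sao Paulo", {"people": 12_000_000})
london = Vertex("London")  # hypothetical second city

# One person visiting many places is modelled as one edge per visit.
connect(luca, "Visited", sao_paulo, on=2015)
connect(luca, "Visited", london, on=2015)

visited = [e.target.name for e in luca.out_edges if e.label == "Visited"]
print(visited)
```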
Worked
on: 2015
Graph theory is so simple,
How does a true* Graph Database manage relationships?
Luca, Sao Paulo, Visited (on: 2015)
Record IDs: #13:55, #15:99, #22:11 (two Vertices and one Edge)
Each element in the Graph has its own immutable Record ID
A Graph Database creates the relationship just once (when the edge is created)
VS
An RDBMS computes the relationship every time
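A minimal sketch of the idea, assuming a Record ID is a (cluster, position) pair that points straight at a storage slot. The `RID`, `save`, `load` and `visited_places` names are hypothetical, the field names (`out_Visited`, `in_Visited`, `out`, `in`) are illustrative, and which Record ID from the slide belongs to which element is an assumption.

```python
from typing import NamedTuple


class RID(NamedTuple):
    cluster: int
    position: int

    def __str__(self):
        return f"#{self.cluster}:{self.position}"


# Each cluster is a plain list; a record's position is its slot in that list.
clusters = {13: [None] * 100, 15: [None] * 100, 22: [None] * 100}


def save(rid, record):
    clusters[rid.cluster][rid.position] = record


def load(rid):
    # Direct positional access: no search, no index lookup, no join.
    return clusters[rid.cluster][rid.position]


# Assumed mapping of the slide's Record IDs to elements.
luca, sao_paulo, visited = RID(13, 55), RID(15, 99), RID(22, 11)

# The relationship is written once, when the edge is created:
# each vertex stores the edge's RID, and the edge stores both vertex RIDs.
save(luca, {"name": "Luca", "out_Visited": [visited]})
save(sao_paulo, {"name": "Sao Paulo", "in_Visited": [visited]})
save(visited, {"on": 2015, "out": luca, "in": sao_paulo})


def visited_places(person_rid):
    """Follow stored Record IDs from a person to every place they visited."""
    person = load(person_rid)
    return [load(load(edge_rid)["in"])["name"] for edge_rid in person["out_Visited"]]


print(visited_places(luca))
```

Each hop is a constant-time slot access, regardless of how many records the clusters hold.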
When you move from an RDBMS to a Graph Database you jump from O(log N) speed to near O(1)
With a Graph Database, the traversing time is
Graph Databases Easily Manage Complex Relationships
No costs to traverse relationships:
• Recommendation engines
• Social Applications
• Spatial Apps
GraphDB Database Quadrant
[Quadrant chart. Axes: Relationships Complexity, Data Complexity. Plotted: Relational, Key Value, Column, Graph, Document]
These were 1st generation NoSQL
1st Generation NoSQL: Scenario
[Figure: Application, ETL, Oracle (RDBMS), Redis or Memcache (Key/Value), MongoDB (DocDB), Neo4j (GraphDB)]
1st Generation NoSQL: Fact
In > 90% of use cases, NoSQL products are
1st Generation NoSQL: Problems
- No standard between NoSQL products
- Multiple vendors = multiple skills
- ETL + synchronization code
2nd Generation NoSQL is
What’s a Multi-Model DBMS?
Graph
Document
Object
Key/Value
Multi Model represents the intersection of multiple models in just one product
- Just one product to learn and maintain
- Just one vendor relationship to manage
- No ETL, no synchronization required
Vertices and Edges are Documents
{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Jill",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial"
  }
}
Jill --Makes--> Order
General purpose solution:
• JSON
• Schema-less
• Schema-full
• Schema-hybrid
• Nested documents
Polymorphic queries
Luca (Provider)
Jill (Customer)
SELECT * FROM Customer
SELECT * FROM Provider
SELECT * FROM Actor
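The three queries above can be mimicked with ordinary class inheritance. This sketch assumes that Customer and Provider both extend Actor, which the slide implies but does not state; the `select_from` helper is hypothetical and only imitates what a polymorphic `SELECT` would return.

```python
class Actor:
    def __init__(self, name):
        self.name = name


class Customer(Actor):
    pass


class Provider(Actor):
    pass


# The two records from the slide.
records = [Provider("Luca"), Customer("Jill")]


def select_from(cls):
    """Return the names of all records of class cls or any of its subclasses."""
    return [record.name for record in records if isinstance(record, cls)]


print(select_from(Customer))
print(select_from(Provider))
print(select_from(Actor))
```

Querying the superclass returns records of every subclass, which is what makes the query polymorphic.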
There are a few DBMSs that claim to be Multi-Model, but they do not have a true Graph Engine.
The “Graph” is only a layer on top of the engine.
Meet OrientDB
The First Ever Multi-Model Database Combining Flexibility of Documents with
With a true Graph, Document,
FEATURES | ORIENTDB | MONGODB | NEO4J | MYSQL (RDBMS)
Operational Database X X X
Graph Database X X
Document Database X X
Object-Oriented Concepts X
Schema-full, Schema-less, Schema mix X
User and Role & Record Level Security X
Record Level Locking X X X
SQL X X
ACID Transaction X X X
Relationships (Linked Documents) X X X
Custom Data Types X X X
Embedded Documents X X
Multi-Master Zero Configuration Replication X
Sharding X X
Server Side Functions X X X
Native HTTP Rest/ JSON X X
Embeddable with No Restrictions X