Big Data & Scripting
storage networks and
distributed file systems
adaptivity: Cut-and-Paste
1• distribute blocks to [0,1] using hash function
• start withn nodes: n equal parts of [0,1] [0,1n]→N1, (i−n1,ni]→Ni
• adding a node:
– reassign stripe of width n+11 from each part to new nodeNn+1 – move blocks in these parts to Nn+1
– fuse new part together sorted by previous block assignment consider as one continuous interval
1Brinkmann, Salzwedel, ScheidelerEfficient, distributed data placement
adaptivity: Cut-and-Paste
• assumption: hash function yields uniform random distribution
• expected equal distribution of blocks to nodes
• node addresses can be determined without keeping table
– retrace splitting scheme
– ordering before fusing allows exact location
• not applicable in heterogeneous situations
• extensions to heterogeneous case exist2
2Brinkmann, Salzwedel, ScheidelerCompact adaptive placement schemes for
non-uniform requirements , 2002
data access
• until now: store and retrieve blocks
• distribution and redundancy by duplicates
• open: accessing this data
• possible guarantees/properties:
– parallel data access distributed among nodes
– synchronization with read operations (i.e. read always current state) – synchronization of write/update operations (e.g. which update
wins)
– performance:
latency = time until system reacts to request
bandwidth = transferable data per time unit (e.g. Mb/s)
update strategy determines guarantees and strategies for read access
data access: consistency and synchronicity
• redundancy →block update affects multiple nodes
synchronous updates
• sequence of read write accesses yields correct results example:
– write operationA1 affects one copy of a block Bi – next read access A2 reads different copy of Bi • all copies ofBi should be identical (synchronous)
consistency
• parallel updates should not lead to “alternative histories” example: simultaneous updates A1 and A2 to different copies of
blockBi [] two copies of Bi in different states • determine winner copy of Bi or prevent parallel update
data access: master copy
the master copy can be used ensure consistency:
• blockBi has r duplicates: Bi1, . . . ,Bir
• a single, distinguished copy is Bi’s master copy, sayBi1
• consistency is ensured by limiting write access to the master copy
• reading:
– use any copy
reading in parallel is allowed
• writing:
– write only to master copy – update duplicates from master
data access: master copy
updates can be executed using different strategies, resulting in different guarantees
immediately
• hold reading access when master copy is written
• can guarantee synchronicity
lazily
• e.g. when system load allows without blocking access
• copies can be in obsolete state
data access: master copy
• access-blocking needs centralized write access
• centralization leads to bottle-necks in efficiency and security guarantees like synchronicity can often only be given in exchange for efficiency:
synchronous systems with immediate update
• have block reading access on updates (→centralization)
• can guarantee synchronicity and consistency
asynchronous / lazy updates
• can still guarantee consistency
data access: without master copy
• writing is allowed to any duplicate
• synchronicity could be achieved by blocking duplicates
– blocking in a distributed situation as complicated – easily leads to locking situations
updates without blocking or master copy
• parallel writing is allowed
• two duplicates can be in different updated states
• neither synchronicity nor consistency guaranteed
• synchronization strategy needed to resolve inconsistencies
– duplicates in different states have to be reunified – example: latest update always wins
data access: transactions
abstraction of grouped updates• implement invariants in storage systems
• example: online store
– (1) outstanding debts, (2) bills to be send – (3) goods available, (4) goods to be send – when order is placed, ensure (1)-(4) are
• all updated (purchase successful, transaction completed) • not updated at all (purchase failed, transaction failed) • transactions group individual updates together
usually either all updates are executed or none (atomic)
• systemcan guarantee properties for transactions
e.g. ACID (atomic, consistent, isolated, durable) in RDBMS
data access: the CAP-theorem
formalize guarantees and network model:• consistency
– transactions, guarantee ACID-properties – allows to keep system in a consistent state
• availability
– data still accessible if nodes fail (e.g. backups)
• partition tolerance
– tolerate disconnection of the network – individual parts function independently – allows arbitrary scaling
• network model
– no global clock (asynchronous) – nodes decide on local information
data access: the CAP theorem
CAP theorem
Maximal two properties out of consistency, availability and partition tolerance can be guaranteed.
• conjectured by Brewer, 2000
• proven for an asynchronous network model3
• improvements are possible when nodes have a global timer (synchronous network model)
• keeping all nodes on single, global time is very complicated (if possible at all)
3Lynch, Gilbert,Brewer’s conjecture and the feasibility of consistent,
data access: trade-offs
• redundancy
– enables parallelism and failure tolerance – enforces updates in multiple places
• collides with
– efficiency
– synchronicity andconsistency
• systems differ in the subset of properties they provide
– optimal system for particular task result of required guarantees – distributed RDBMS systems: consistency, availability
– NOSQL4 systems: availability, partition tolerance
4not only SQL