• No results found

A few observations. The election protocol. A few observations. A few observations

N/A
N/A
Protected

Academic year: 2021

Share "A few observations. The election protocol. A few observations. A few observations"

Copied!
12
0
0

Loading.... (view fulltext now)

Full text

(1)

The election protocol

Processes agree on linear ordering (e.g. by pid) Each maintains set of all processes that believes to be operational

When detects failure of , it removes from .

!!! and chooses smallest in to be new coordinator

If = , then is new coordinator Otherwise, sends UR-ELECTED to

p UP

p

q

p

UPp

p c c

q

q UPp

p p

p

A few observations

What if , which has not detected the failure of c, receives a STATE-REQ from ?

p!

q

A few observations

What if , which has not detected the failure of , receives a STATE-REQ from ?

it concludes that must be faulty it removes from every

c

p!

UPp!

q q! < q c

A few observations

What if , which has not detected the failure of , receives a STATE-REQ from ?

it concludes that must be faulty it removes from every

What if receives a STATE-REQ from after it has changed the coordinator to ?

c

p!

UPp!

q

p! c

q! < q c

q

(2)

A few observations

What if , which has not detected the failure of , receives a STATE-REQ from ?

it concludes that must be faulty it removes from every

What if receives a STATE-REQ from after it has changed the coordinator to ?

ignores the request c

p!

UPp!

q

p! c

q! < q c

q p!

Total failure

Suppose is the first process to recover, and that is uncertain

Can decide ABORT?

Some processes could have decided COMMIT after crashed!

p p p

p

Total failure

Suppose is the first process to recover, and that is uncertain

Can decide ABORT?

Some processes could have decided COMMIT after crashed!

is blocked until some recovers s.t. either can recover independently

is the last process to fail–then can simply invoke the termination protocol

p p p

p

p

q

q q

q

Determining the last process to fail

Suppose a set of processes has recovered Does contain the last process to fail?

R R

(3)

Determining the last process to fail

Suppose a set of processes has recovered Does contain the last process to fail?

the last process to fail is in the set of every process

so the last process to fail must be in R

R

UP

!

p∈R

UPp

Determining the last process to fail

Suppose a set of processes has recovered Does contain the last process to fail?

the last process to fail is in the set of every process

so the last process to fail must be in

contains the last process to fail if R

R

UP

!

p∈R

UPp

!

p∈R

UPp⊆ R R

(4)

Introduction to fault tolerance

Availability

The availability of a system is the probability that the system will perform its required action

Availability

The availability of a system is the probability that the system will perform its required action

Example:

one workstation crashes once a week

takes 2 minutes to recover

Availability

The availability of a system is the probability that the system will perform its required action

Example:

one workstation crashes once a week

takes 2 minutes to recover Availability:

1− pcrash = 1 − 2 · 10−4 = 0.9998

(5)

Availability

The availability of a system is the probability that the system will perform its required action

Example:

one workstations crash once a week

take 2 minutes to recover Availability:

30

Availability

The availability of a system is the probability that the system will perform its required action

Example:

one workstations crash once a week

take 2 minutes to recover Availability:

30

(1 − pcrash)30 ≈ 1 − n · pcrash = 0.994

Increasing availability

Avoid a single point of failure

use replication (in time, or space) replicas communicate through narrow interface (e.g. send/receive)

Increasing availability

Avoid a single point of failure

use replication (in time, or space) replicas communicate through narrow interface (e.g. send/receive)

Example

Replicate the workstation 30 times Availability:

(6)

Increasing availability

Avoid a single point of failure

use replication (in time, or space) replicas communicate through narrow interface (e.g. send/receive)

Example

Replicate the workstation 30 times Availability:

1− pncrash = 1 − (2 · 10−4)30 = 1− 10−111

Modeling faults

Mean Time To Failure/ Mean Time To Recover close to hardware

Threshold: out of

makes condition for correct operation explicit

measures fault-tolerance of architecture, not single components

Set of explicit failure scenarios

f n

A hierarchy of failure models

Crash

A hierarchy of failure models

Crash Fail-stop

(7)

A hierarchy of failure models

Crash

Send Omission Receive Omission

Fail-stop

A hierarchy of failure models

Crash

Send Omission

General Omission

Receive Omission

benign failures

Fail-stop

A hierarchy of failure models

Crash

Arbitrary failures with message authentication Send Omission

General Omission

Receive Omission

benign failures

Fail-stop

A hierarchy of failure models

Crash

Arbitrary failures with message authentication Arbitrary (Byzantine) failures Send Omission

General Omission

Receive Omission

benign failures

Fail-stop

(8)

Replication in space

Run parallel copies of a unit Vote on replica output

Failures are masked

High availability, but at high cost

Replication in time

When a replica fails, restart it (or replace it) Failures are detected, not masked

Lower maintenance, lower availability Tolerates only benign failures

Better than you think...

Non-determinism

An event is non-deterministic if the state that it produces is not uniquely determined by the state in which it is executed

Handling non-deterministic events at different replicas is challenging

Replication in time requires to reproduce during recovery the original outcome of all non-deterministic events

Replication in space requires each replica to handle non-deterministic events identically

t

Primary-Backup

(9)

The Idea

Clients communicate with a single replica !

! (the primary)

The primary updates the other replicas (backups) Backups detect the failure of the primary using a timeout mechanism,

Clients fail over to a backup

Note: Non-deterministic events are executed only

! at the primary

Terminology

The failover time of a primary-backup service is the longest time the service can be without a primary

The service has a server outage at if some correct client sends a request at time to the service, but does not receive a response

A (k,Δ)-bofo service is one in which all

server outages can be grouped into at most k intervals of time, each of at most length Δ

t t

PB: A specification

(Budhiraja, Marzullo, Schneider, Toueg)

PB1: There exists a local predicate on the state of each server . At any time, there is at most one server whose state satisfies

PB2: Each client maintains a server identity such that to make a request, client . sends a message to

PB4: There exist fixed values k and Δ such that the service behaves like a single (k,Δ)-bofo server

PB3: If a client request arrives at a server that is not the current primary, then that request is not enqueued (and therefore is not processed)

Desti

Desti i i

P rmys

P rmys

s

s

A simple example:

system model

point-to-point communication non-faulty channels

upper bound δ on message delivery time at most one server crashes

(10)

A simple example:

system model

point-to-point communication non-faulty channels

upper bound δ on message delivery time at most one server crashes

Two processes:

the primary the backup

p1 p2

A simple example:

‘s protocol

On receipt of a client request, process consumes request and updates its state send state update message to

sends response to client without waiting for ack from

sends heartbeat message to every seconds

p 1

p1

p1 p2

p2

p2 τ

A simple example:

’s protocol

Upon receiving a state update from , updates its state

If does not receive a heartbeat for seconds,

declares itself primary it informs the clients

it begins consuming subsequent requests from clients

p 2

p1 p2 τ + δ p2

p2

It meets the spec...

p1 p2

! c

1

2 3

4

Failover: Time during which PB1: Prmyp

1 ∧Prmyp

2 = false

¬Prmyp

1 ∧ ¬Prmyp

2

Prmyp

1 ≡ p1 has not crashed Prmyp

2 ≡ p2 has not received a message from for p1 τ + δ seconds

"

(11)

It meets the spec...

p1 p2

!

"

c

1

2 3

4

Failover: Time during which PB1: Prmyp

1 ∧Prmyp

2 = false

¬Prmyp

1 ∧ ¬Prmyp

2

Prmyp

1 ≡ p1 has not crashed Prmyp

2 ≡ p2 has not received a message from for p1 τ + δ seconds

It meets the spec...

p1 p2

! "

! c

1

2 3

4

Failover: Time during which PB1: Prmyp

1 ∧Prmyp

2 = false

¬Prmyp

1 ∧ ¬Prmyp

2

Prmyp

1 ≡ p1 has not crashed Prmyp

2 ≡ p2 has not received a message from for p1 τ + δ seconds

"

It meets the spec...

p1 p2

" ! "

! c

1

2 3

4

Failover: Time during which PB1: Prmyp

1 ∧Prmyp

2 = false

¬Prmyp

1 ∧ ¬Prmyp

2

Prmyp

1 ≡ p1 has not crashed Prmyp

2 ≡ p2 has not received a message from for p1 τ + δ seconds

"

It meets the spec...

p1 p2

" ! "

! c

1

2 3

4

Failover: Time during which PB1: Prmyp

1 ∧Prmyp

2 = false

¬Prmyp

1 ∧ ¬Prmyp

2

Prmyp

1 ≡ p1 has not crashed Prmyp

2 ≡ p2 has not received a message from for p1 τ + δ seconds

τ + 2δ

"

(12)

…indeed, it does!

PB2, PB3: Follow immediately from protocol

PB4: Find k, # to implement (k,#)-bofo server

! c

1

2 3

4

"

…indeed, it does!

PB2, PB3: Follow immediately from protocol

PB4: Find k, # to implement (k,#)-bofo server

! c

1

2 3

4

• k = 1 (since at most one crash)

• Δ = longest interval during which a request elicits no response – assume crashes at

– any client request sent to at time

! or later may be lost

– may not become the new primary

! until

– client may not learn that is new primary for another

p1

p1

p2

p2 tc+ τ + 2δ

tcδ

tc

δ

∆ = τ + 4δ

"

Some like it hot

Hot Backups process information from the primary as soon as they receive it

Cold Backups log information received from primary, and process it only if primary fails

Rollback Recovery implements cold backups cheaply:

the primary logs directly to stable storage the information needed by backups

if the primary crashes, a newly initialized process is given content of logs—backups are generated “on demand”

References

Related documents

Giovanni Maria Bresciani, born in 1987, studied at Accademia di Architettura di Mendriso, Swit- zerland, where he studied under Mario Botta, Michele Arnaboldi, Burkhalter Sumi,

We propose to model such behavior using variants of smooth transition autoregressive (STAR) models which allow the interest rate threshold to evolve over time as a function of

Multiple regression analyses using state, city, and metropolitan data revealed a statistically significant or nearly significant relationship of melanoma incidence against

The findings suggest that firms can improve market access and growth through guanxi networks, but managers need to capitalize on them from the personal to the corporate level..

Similarly, in attempt to investigate the relation between working capital management and profitability, Racheman and Nasr (2007) selected a sample of 94 Pakistani firms listed

candidates or of the program, two critical requirements of the state administrative code. The most serious concern noted by the team was the lack of responsibility on the part of

The Scrum Product Owner uses the Scrum Product Backlog during the Sprint Planning Meeting to describe the top entries to the team.. The Scrum Team then determines which items they can