Remus: High Availability via Asynchronous Virtual Machine Replication
Brendan Cully, Geoffrey Lefebvre, Dutch Meyer, Mike Feeley, Norm Hutchinson, and Andrew Warfield
Department of Computer Science, The University of British Columbia
NSDI ’08: 5th USENIX Symposium on Networked Systems Design and Implementation (2008)
Presented by
Outline
– Introduction
– Design and Implementation
– Evaluation
Introduction
It’s hard and expensive to design a highly available system that survives hardware failure
– Using redundant components.
– Customized hardware and software.
– E.g. HP NonStop Server.
Our goal is to provide a high-availability system that offers:
– Generality and transparency
    – Without modification of the OS and applications.
Introduction
Our approach
– VM-based whole-system replication
    – Frequently checkpoint the whole virtual machine state.
    – The protected VM and the backup VM are located on different physical hosts.
– Speculative execution
    – We buffer state to sync to the backup later, and continue execution ahead of the sync point.
– Asynchronous replication
Design and Implementation
Our implementation is based on:
– The Xen virtual machine monitor.
– Xen’s support for live migration, which provides fine-grained checkpoints.
– Two host machines connected over redundant gigabit Ethernet connections.
The virtual machine does not actually execute on the backup host until a failure occurs.
Design and Implementation
Pipelined checkpoint (see Figure 1, p. 5)
1. Pause the running VM and copy any changed state into a buffer (discussed in more detail later).
2. With the state changes preserved in a buffer, the VM is unpaused and speculative execution resumes.
3. The buffered state is transmitted to the backup host.
4. When the complete state has been received, the backup sends an ack to the primary.
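The four steps can be sketched as a toy simulation. This is a minimal illustration, not the Remus implementation: the `Primary` and `Backup` classes, the dictionary used as "VM state", and the method names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Hypothetical backup host: holds the last fully received checkpoint."""
    state: dict = field(default_factory=dict)

    def receive(self, delta: dict) -> bool:
        # Step 4: apply the complete delta, then acknowledge it.
        self.state.update(delta)
        return True


@dataclass
class Primary:
    """Hypothetical protected VM that tracks keys changed since the last epoch."""
    state: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)
    paused: bool = False

    def write(self, key, value):
        assert not self.paused, "guest cannot run while paused"
        self.state[key] = value
        self.dirty.add(key)

    def checkpoint(self) -> dict:
        # Step 1: pause and copy the changed state into a buffer.
        self.paused = True
        buffer = {k: self.state[k] for k in self.dirty}
        self.dirty.clear()
        # Step 2: unpause; execution continues speculatively.
        self.paused = False
        return buffer


primary, backup = Primary(), Backup()
primary.write("a", 1)
primary.write("b", 2)
buffered = primary.checkpoint()
primary.write("a", 99)            # speculative write while the buffer is in flight
acked = backup.receive(buffered)  # Steps 3 and 4: transmit, then ack
```

The speculative write to `a` does not reach the backup until the next checkpoint, which is why the backup always holds a consistent, slightly older image.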
Design and Implementation
Next we discuss how to checkpoint CPU state, memory, disk, and network.
CPU & memory state
– Checkpointing is implemented on top of Xen’s existing machinery for performing live migration.
What is live migration?
– A technique by which a VM is relocated to another physical host with only a slight interruption.
Design and Implementation
How does live migration work?
1. Memory is copied to the new location while the VM continues to run at the old location.
2. During migration, writes to memory are intercepted, and dirty pages are copied to the new location in rounds.
3. After a specified number of intervals, the guest
Design and Implementation
Page protection, enforced by the hardware MMU, is used to trap writes and identify dirty pages.
In effect, Remus implements checkpointing as repeated executions of the final round of live migration.
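A rough sketch of the idea follows, assuming a toy memory model in which a Python set stands in for the MMU's write-protection traps. The class `TrackedMemory` and the function `run_epochs` are hypothetical names chosen for illustration; real dirty tracking happens in the hypervisor, not in guest-visible code.

```python
class TrackedMemory:
    """Toy guest memory; the dirty set stands in for MMU write-protection traps."""

    def __init__(self, page_count: int):
        self.pages = [0] * page_count
        self.dirty = set()

    def write(self, page: int, value: int) -> None:
        # A real hypervisor would take a fault on the first write to a
        # protected page; here we simply record the page number.
        self.dirty.add(page)
        self.pages[page] = value

    def final_round(self) -> dict:
        """Copy only the pages dirtied since the last round, then reset tracking."""
        delta = {p: self.pages[p] for p in sorted(self.dirty)}
        self.dirty.clear()
        return delta


def run_epochs(memory: TrackedMemory, replica: list, epochs: list) -> None:
    """Each epoch: let the guest write, then repeat the 'final round' copy."""
    for writes in epochs:
        for page, value in writes:
            memory.write(page, value)
        for page, value in memory.final_round().items():
            replica[page] = value


memory = TrackedMemory(4)
replica = [0] * 4
run_epochs(memory, replica, [[(0, 7), (2, 9)], [(2, 5)]])
```

Only the pages touched in an epoch are copied, so the cost of a checkpoint scales with the write rate rather than with total memory size.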
Design and Implementation
Disk buffering
– Requirement
    – All disk writes in the VM are configured to be write-through.
    – As a result, the disk remains crash-consistent even if the active and backup hosts both fail.
– On-disk state doesn’t change until the entire checkpoint has been received (see Figure 4, p. 13).
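The commit rule can be illustrated with a small sketch. The `BackupDisk` class and its method names are hypothetical, and the `discard` path on failover is an assumption about how an incomplete checkpoint would be handled, not something stated on the slide.

```python
class BackupDisk:
    """Toy backup-side disk: writes are held in memory until the checkpoint commits."""

    def __init__(self):
        self.blocks = {}   # stands in for on-disk state
        self.pending = {}  # writes buffered for the checkpoint in progress

    def buffer_write(self, sector: int, data: bytes) -> None:
        # Writes arriving from the primary never touch the disk directly.
        self.pending[sector] = data

    def commit(self) -> None:
        """Called once the entire checkpoint has been received."""
        self.blocks.update(self.pending)
        self.pending.clear()

    def discard(self) -> None:
        """Assumed failover path: drop writes from an incomplete checkpoint."""
        self.pending.clear()


disk = BackupDisk()
disk.buffer_write(0, b"v1")
disk.commit()                     # checkpoint 1 fully received
disk.buffer_write(0, b"v2")       # checkpoint 2 still in flight
before_commit = dict(disk.blocks)
disk.discard()                    # primary fails before checkpoint 2 completes
```

Because sector 0 still holds `v1` after the discard, the backup's disk matches the last complete checkpoint rather than a half-applied one.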
Design and Implementation
Network buffering
– Most networks cannot be counted on for reliable data delivery.
– Therefore, networked applications use reliable protocols such as TCP to deal with packet loss or duplication.
    – This fact simplifies the network buffering problem considerably: transmitted packets do not require replication.
Design and Implementation
To ensure that packet transmission is atomic and checkpoints are consistent:
– Outbound packets generated since the previous checkpoint are queued.
– Outbound packets are released only once that checkpoint has been acknowledged by the backup site.
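The queue-then-release rule might look like the following sketch. The `OutputBuffer` class, its two-queue layout, and its method names are assumptions made for illustration; they are not taken from the Remus code.

```python
from collections import deque


class OutputBuffer:
    """Toy outbound packet buffer gated on checkpoint acknowledgements."""

    def __init__(self):
        self.current = deque()       # packets from the epoch still executing
        self.awaiting_ack = deque()  # packets covered by the checkpoint in flight
        self.released = []           # packets actually put on the wire

    def send(self, packet: str) -> None:
        # The guest believes the packet was sent; it is only queued.
        self.current.append(packet)

    def checkpoint_taken(self) -> None:
        # Packets generated before this checkpoint now wait for its ack.
        self.awaiting_ack.extend(self.current)
        self.current.clear()

    def checkpoint_acked(self) -> None:
        # The backup holds the state that produced these packets: release them.
        while self.awaiting_ack:
            self.released.append(self.awaiting_ack.popleft())


buf = OutputBuffer()
buf.send("p1")
buf.checkpoint_taken()
buf.send("p2")          # speculative output from the next epoch
buf.checkpoint_acked()
```

After the ack, only `p1` has left the host; `p2` stays queued because the state that generated it has not yet been replicated.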
Design and Implementation
Detecting Failure
– We use a simple failure detector that is directly integrated into the checkpointing stream:
    – A timeout of the backup responding to commit requests indicates that the backup has failed.
    – A timeout of new checkpoints being transmitted from the primary indicates that the primary has failed.
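Both timeouts follow the same pattern, sketched below with an explicit clock value so the example is deterministic. The `TimeoutDetector` class and the one-second timeout are hypothetical; the slides do not give the actual timeout values.

```python
class TimeoutDetector:
    """Toy failure detector: the peer is presumed failed after a silent interval."""

    def __init__(self, timeout_s: float, now: float = 0.0):
        self.timeout_s = timeout_s
        self.last_seen = now

    def heard_from_peer(self, now: float) -> None:
        # Called on each commit response (primary side) or
        # each new checkpoint (backup side).
        self.last_seen = now

    def peer_failed(self, now: float) -> bool:
        return now - self.last_seen > self.timeout_s


# Backup's view of the primary: a checkpoint arrives at t=0.5, then silence.
backup_view = TimeoutDetector(timeout_s=1.0)
backup_view.heard_from_peer(now=0.5)
still_ok = backup_view.peer_failed(now=1.2)   # 0.7 s of silence
timed_out = backup_view.peer_failed(now=2.0)  # 1.5 s of silence
```

The same class serves both directions: the primary would feed it commit responses, while the backup feeds it incoming checkpoints.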
Evaluation