Clock Synchronization
Introduction
In a single-CPU system, critical regions, mutual exclusion, and
other synchronization problems are generally solved using
mechanisms such as semaphores and monitors, which rely heavily
on shared memory.
Communication alone is not enough: processes also need to cooperate,
which requires synchronization.
Distributed synchronization needed for
– Transactions (bank account via ATM)
– Access to shared resource (network printer)
Clock Synchronization
In a centralized system, time is unambiguous.
A process will make a system call and the kernel will tell
it the time.
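Consider make on an NFS-mounted file system:
the compiling machine compares time stamps, so if the machine where a source file was edited has a slow clock, the newer source can look older than its object file and will not be recompiled.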
Can we set all clocks in a distributed system to have the
same time?
Physical Clocks
Every computer has a local clock, although "timer" is the more
appropriate term.
A timer is usually a precisely machined quartz crystal. When
kept under tension, quartz crystals oscillate at a well-defined
frequency that depends on the kind of crystal, how it is cut,
and the amount of tension.
Associated with each crystal are two registers, a counter and
a holding register.
Physical Clocks
In a distributed system, there is no way to guarantee that the
crystals in different computers all run at exactly the same
frequency.
Crystals running at different rates cause the clocks to drift
gradually out of sync and to give different values when read out.
This difference in time values is called clock skew.
Solution: Universal Coordinated Time (UTC):
– Formerly called GMT
– Based on the number of transitions per second of the cesium-133 atom
– At present, the real time is taken as the average of some 50 cesium clocks around the world: International Atomic Time (TAI)
– UTC is broadcast through shortwave radio and satellite
Physical Clocks
Solar Time
– 1 sec = 1 day / 86400
(Take the interval between two consecutive transits of the sun, i.e. from "noon" to "noon", which is one solar day, and divide it by 24*60*60)
– Problem: days are of different lengths (due to tidal friction, etc.)
Mean solar second : averaged over many days
Problem: atomic clocks do not keep in step with solar time
Not every machine has a UTC receiver
– If one machine has a receiver, use it to keep the others synchronized
Basic Principle:
– Every machine has a timer that generates an interrupt H (typically 60) times per second
– There is a clock in machine p that ticks on every timer interrupt. Denote its value by Cp(t), where t is UTC time
– Ideally we would have Cp(t) = t, or dC/dt = 1
– Theoretically, if H = 60, the timer generates 216,000 ticks per hour (dC/dt = 1)
– In practice, the typical relative error of timer chips is about 10⁻⁵, so a machine sees between 215,998 and
216,002 ticks per hour
Physical Clocks
If there is a constant ρ such that 1 - ρ ≤ dC/dt ≤ 1 + ρ, the timer
can be said to be working within its specification.
The constant ρ is specified by the manufacturer and is known as
the maximum drift rate.
If two clocks drift from UTC in opposite directions, then at a time
Δt seconds after they were last synchronized they will be at
most 2ρΔt apart.
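The drift bound can be checked with a little arithmetic. The sketch below takes ρ as the maximum drift rate, Δt as the time since the last synchronization, and δ as the largest skew one is willing to tolerate; the function names are made up for this illustration.

```python
def max_skew(rho, dt):
    """Worst-case gap (seconds) between two clocks dt seconds after
    they were synchronized, if they drift in opposite directions."""
    return 2 * rho * dt


def resync_interval(rho, delta):
    """Longest time (seconds) between synchronizations that still
    keeps any two clocks within delta seconds of each other."""
    return delta / (2 * rho)


def ticks_per_hour_range(h, rho):
    """Fewest and most ticks per hour for a timer that nominally
    interrupts h times per second with maximum drift rate rho."""
    nominal = h * 3600
    return nominal * (1 - rho), nominal * (1 + rho)


if __name__ == "__main__":
    rho = 1e-5
    print(ticks_per_hour_range(60, rho))   # roughly 215,998 to 216,002
    print(max_skew(rho, 3600))             # skew after one hour
    print(resync_interval(rho, 0.001))     # interval for 1 ms tolerance
```

With ρ = 10⁻⁵, two clocks can be about 72 ms apart after an hour, so keeping them within 1 ms of each other means resynchronizing roughly every 50 seconds.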
Clock Synchronization Principles
Principle I: Passive Time Server
Every machine asks a time server for the accurate time at
least once every δ/2ρ seconds, where δ is the maximum skew
allowed between any two clocks.
But you need an accurate measure of the round-trip delay,
including interrupt handling and the processing of incoming
messages.
Principle II: Active Time Server
Cristian's Algorithm: based on the Passive Time Server
Well suited to systems in which one machine (the time
server) has a UTC receiver
Every δ/2ρ seconds, ask the server for the time
What are the problems?
Major
– Client clock is fast (UTC will be smaller than the client clock value)
– What to do?
Minor
Cristian's Algorithm
To estimate the one-way network delay, use (T1 - T0)/2.
If we know I (the time server's interrupt handling time), we can
estimate the network delay to be (T1 - T0 - I)/2.
Steps:
1) Client sends request at T0
2) Server replies with its current clock value, Tserver
3) Client receives response at T1
4) Client sets its clock to Tclient = Tserver + (T1 - T0)/2
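The four steps can be sketched in a few lines. This is a minimal illustration, not a production client: `request_server_time` is a hypothetical callable standing in for the network round trip, and the function names are assumptions made for this example.

```python
import time


def cristian_estimate(t0, t_server, t1, interrupt_time=0.0):
    """Estimate the current server time on receipt of the reply.

    t0, t1: client clock readings when the request was sent and the
            reply received.
    t_server: clock value carried in the server's reply.
    interrupt_time: the server's interrupt handling time I, if known.
    """
    one_way_delay = (t1 - t0 - interrupt_time) / 2
    return t_server + one_way_delay


def sync_once(request_server_time, local_clock=time.monotonic):
    """Run one request/reply exchange and return the estimated time.

    request_server_time is a hypothetical stand-in for the real
    network call; it must return the server's clock value.
    """
    t0 = local_clock()                 # step 1: send request at T0
    t_server = request_server_time()   # step 2: server replies with Tserver
    t1 = local_clock()                 # step 3: reply received at T1
    return cristian_estimate(t0, t_server, t1)  # step 4
```

The round trip is timed with the client's own clock, so only the difference T1 - T0 matters; that is why a monotonic clock is a reasonable default here.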
The Berkeley Algorithm
The time daemon asks all the other machines for their clock values
The machines answer
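The two steps listed here are the polling half of a round. In the usual textbook description of the Berkeley algorithm, the daemon then averages the answers together with its own clock and tells each machine how far to adjust; the sketch below assumes that continuation, and its function and parameter names are hypothetical.

```python
def berkeley_round(daemon_time, reported_times):
    """Compute one round of adjustments.

    daemon_time: the time daemon's own clock value.
    reported_times: dict mapping machine name -> reported clock value.
    Returns (average, daemon_adjustment, adjustments_by_machine).
    """
    all_times = [daemon_time] + list(reported_times.values())
    average = sum(all_times) / len(all_times)
    adjustments = {name: average - t for name, t in reported_times.items()}
    return average, average - daemon_time, adjustments


if __name__ == "__main__":
    # Clock values in minutes past midnight: 3:00, 3:25 and 2:50.
    avg, daemon_adj, adj = berkeley_round(180, {"A": 205, "B": 170})
    print(avg, daemon_adj, adj)
```

Sending each machine a relative adjustment rather than an absolute time means the reply does not go stale while it travels over the network.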
Averaging Algorithms
Both Cristian's algorithm and the Berkeley algorithm are highly centralized, with the usual disadvantages: a single point of failure, congestion around the server, and so on.
One class of decentralized clock synchronization algorithms works by dividing time into fixed-length re-synchronization intervals.
The ith interval starts at T0 + iR and runs until T0 + (i+1)R, where T0 is an agreed upon moment in the past, and R is a system parameter.
At the beginning of each interval, every machine broadcasts the current time according to its clock.
After a machine broadcasts its time, it starts a local timer to collect all other broadcasts that arrive during some interval S.
Some algorithms:
– average out the time.
– discard the m highest and m lowest and average the rest -- this is to prevent up to m faulty clocks sending out nonsense
– correct each message by adding to it an estimate of the propagation time from the source.
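The three variants listed above can be combined in one small function. This is a sketch under stated assumptions: the broadcasts received during the interval S are already collected in a list, and the names `interval_bounds` and `resync_value` are invented for this example.

```python
def interval_bounds(t0, r, i):
    """Start and end of the i-th resynchronization interval."""
    return t0 + i * r, t0 + (i + 1) * r


def resync_value(received_times, m=0, propagation_estimates=None):
    """Combine the clock values collected during the interval S.

    m: number of highest and lowest values to discard, which
       tolerates up to m faulty clocks.
    propagation_estimates: optional per-message delay estimates,
       added to each received value before averaging.
    """
    if propagation_estimates is None:
        propagation_estimates = [0.0] * len(received_times)
    corrected = sorted(
        t + d for t, d in zip(received_times, propagation_estimates)
    )
    if len(corrected) <= 2 * m:
        raise ValueError("need more than 2*m values to discard m at each end")
    kept = corrected[m:len(corrected) - m]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Two of the five clocks report nonsense; m=1 discards both.
    print(resync_value([100.0, 101.0, 99.0, 500.0, -3.0], m=1))  # 100.0
```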
Logical Clocks
Lamport (1978) showed that clock synchronization is possible and also pointed out that clock synchronization need not be absolute.
If two processes do not interact, it is not necessary that their clocks be synchronized.
What usually matters is not that all processes agree on exactly what time it is, but rather that they agree on the order in which events occur.
Therefore, it is the internal consistency of the clocks that matters, not whether they are particularly close to the real time. It is conventional to speak of the clocks as logical clocks.
Lamport’s Algorithm
Lamport defined a relation called happens-before.
The expression a -> b is read "a happens before b" and means that all processes agree that first event a occurs, then afterward, event b occurs.
The happens-before relation can be observed directly in two situations:
– If a and b are events in the same process, and event a occurs before event b, then a -> b is true.
– If a is the event of a message being sent by one process, and b is the event of that message being received by another process, then a -> b is also true.
Lamport Timestamps
Often we don't need the time, only the ordering:
a -> b (a happens before b)
Each process has its own clock, and the clocks run at different rates.
Lamport's algorithm corrects the clocks.
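The correction can be sketched as a small class. The rule used here is the standard one for Lamport clocks: every event advances the local counter, each message carries its send timestamp, and a receiver moves its counter past that timestamp if it is behind. The class and method names are assumptions made for this illustration.

```python
class LamportClock:
    """Logical clock for a single process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Record a local event and return its timestamp."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp an outgoing message (sending is itself an event)."""
        return self.tick()

    def receive(self, msg_time):
        """Correct the clock on receipt so that send -> receive holds."""
        self.time = max(self.time, msg_time) + 1
        return self.time


if __name__ == "__main__":
    p, q = LamportClock(), LamportClock()
    p.tick()
    p.tick()
    ts = p.send()          # 3
    print(q.receive(ts))   # 4: q jumps ahead of the sender's timestamp
```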
Vector Clocks
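[Figure: space-time diagram of processes P1, P2 and P3, with a vector timestamp on each event]
P1: e11 (1, 0, 0), e12 (2, 0, 0), e13 (3, 5, 2)
P2: e21 (0, 1, 0), e22 (2, 2, 0), e23 (2, 3, 1), e24 (2, 4, 2), e25 (2, 5, 2)
P3: e31 (0, 0, 1), e32 (0, 0, 2)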
Each process i maintains a vector Vi
Vi[i] : number of events that have occurred at i
Vi[j] : number of events that i knows have occurred at process j
Less than or equal: ts(a) ≤ ts(b) if ts(a)[i] ≤ ts(b)[i] for all i. For example, (3, 3, 5) ≤ (3, 4, 5).
ts(e11) = (1, 0, 0) and ts(e22) = (2, 2, 0), which shows e11 -> e22
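The update and comparison rules can be sketched as follows. The class name, the 0-based process indices, and the message pattern in the usage example (e12 to e22, e31 to e23, e32 to e24, e25 to e13) are assumptions; the pattern was inferred from the timestamps in the figure, not read from the slide.

```python
class VectorClock:
    """Vector clock for process i out of n processes (0-based index)."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n

    def local_event(self):
        """Count one more event at this process."""
        self.v[self.i] += 1
        return tuple(self.v)

    def send(self):
        """Sending is an event; the message carries the new vector."""
        return self.local_event()

    def receive(self, ts):
        """Merge the sender's knowledge, then count the receive event."""
        self.v = [max(a, b) for a, b in zip(self.v, ts)]
        self.v[self.i] += 1
        return tuple(self.v)


def leq(a, b):
    """ts(a) <= ts(b): every component of a is <= that of b."""
    return all(x <= y for x, y in zip(a, b))


def happened_before(a, b):
    """a -> b exactly when ts(a) <= ts(b) and the vectors differ."""
    return leq(a, b) and tuple(a) != tuple(b)


if __name__ == "__main__":
    p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
    e11 = p1.local_event()
    e12 = p1.send()
    e21 = p2.local_event()
    e22 = p2.receive(e12)
    print(e11, e22, happened_before(e11, e22))  # (1, 0, 0) (2, 2, 0) True
    print(happened_before(e11, e21), happened_before(e21, e11))  # False False
```

When neither vector is less than or equal to the other, as with e11 and e21, the two events are concurrent.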
Causal Delivery :
if m is sent by P1 and ts(m) is (3, 4, 0) and you are P3, you should