clash I (above) to occur The procedure followed in case clash 1 occurs assumes that the data path
Section 4.1.2, the location change requests issued as part of the procedure for restoration on
if node is acting as an old parent, if packet is a location change request flagged
4.3.2 Node and Link Failure
In the more complex situation where there is a real hardware failure in the network, the basic
mechanism is the same as for the case of management re-configuration discussed previously.
There are some extra considerations, though. For the management re-configuration case, it was
assumed that the link to the new parent is constructed before the disconnection is made, so
there is no time delay between disconnection and establishment of the new link. For the
hardware failure case, this assumption is no longer valid. The network first has to detect that a failure
has occurred; the parentless node or nodes then have to find a node capable of accepting an extra
child and establish the new link so that the database restoration procedure can be initiated. If the delays involved in this procedure are significant, the data paths within the system will become
inaccurate, as updates cannot be forwarded to the nodes involved in the hardware failure while they are not reconnected to a new parent.
It is important to note that, if a node realizes it can no longer communicate with either a parent or
a child, it cannot establish the extent of the damage. The node cannot distinguish between a
simple link failure and a massive hardware failure involving several nodes. All it can determine
is that the communication between itself and either a parent or a child has been disrupted.
Therefore the recovery mechanism has to be general for both link and node failures. Again the
problem is divided into two parts: the detection of a hardware failure, and the recovery mechanism
followed in order to reconnect the isolated nodes and restore the network databases.
4.3.2.1 Detection
The precise detection method will depend on which signalling protocol is used. For instance, if
the CCITT system No. 7 is used, link operability would be indicated by the constant transmission
of packets even if there is no information to be sent (the fill-in signal unit) [113]. The time delay
necessary for a node failure to be detected is also dependent on the adopted signalling protocol.
Basically, though, a node assumes a hardware failure has occurred if no packet is received from a
neighbouring node within a pre-determined period of time (this time interval depends on the
4.3.2.2 Recovery Mechanism
In the management re-configuration case, the old parent and the reconnected child keep a direct
link between themselves so that requests for information that has not yet been restored can be
transferred directly to the reconnected child node and information can be restored on demand. This
direct link is disconnected as soon as the database restoration is finished. Therefore, we need a
means of directly connecting the old parent and the reconnected child in case of link failure and,
in case of node failure, the parent and the children of the failed node.
Either the old parent or the reconnected child (or children) can initiate the process for the
establishment of the direct link. The process is best initiated by the reconnected child (or
children) because the old parent node cannot determine whether or not the child nodes have
already reconnected to a new parent, nor how many of them there are. Hence, the first task
performed by the nodes that have lost connection to their parent is to try to find a new parent
node that can accept an extra child (i.e. the availability of buffer space and the computation load on the prospective new parent need to be checked). A node can either hold a list of alternative
addresses of prospective new parents or obtain this information in real time by querying its
neighbours.
New Parent Node Selection Procedure
Restrictions have to be placed on the selection of a new parent node. The isolated children should not be allowed to connect to a node under their own domain, as this would create loops in the
network topology. However, nodes are only aware of the addresses of their parent and immediate
child nodes; a node does not keep information on its grandchildren, great-grandchildren, and so on.
Moreover, given that the network topology is dynamic, the nodes under the isolated child's
domain can vary. If the nodes keep a list of alternative new parents, this list has to be dynamically
updated as reconfigurations occur, so that it conforms to the current network topology. The difficulty in keeping an up-to-date list of prospective new parents is due to the fact that each node
operates independently and very little information is kept by the nodes about the current network
topology. One possible procedure that can be used in selecting a new parent is for the isolated
child to give preference, or even restrict its choice, to nodes that are at a hierarchical level equal
to or greater than its own. All nodes, then, need to keep information about their current
hierarchical level. Their initial level is set up when the system is first implemented. As
reconfigurations occur, the reconnected nodes update their hierarchical level according to their
new parent's level, keeping this information up-to-date.
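The selection rule described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the protocol's definition: the names `Candidate`, `select_new_parent` and `level_after_reconnection`, the `max_load` threshold, and the tie-break on lowest load are all hypothetical. It also assumes a numeric convention in which a larger level means a node closer to the root, and that a reconnected node takes a level one below its new parent; the text does not fix either choice.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    """A prospective new parent, as seen by the isolated child (hypothetical)."""
    address: str
    level: int             # assumed convention: larger value = closer to the root
    free_child_slots: int  # stand-in for the buffer-space check
    load: float            # stand-in for the computation-load check, 0.0 to 1.0


def select_new_parent(own_level: int,
                      candidates: Iterable[Candidate],
                      max_load: float = 0.8) -> Optional[Candidate]:
    """Pick a parent at a level equal to or greater than our own.

    Restricting the choice this way is the heuristic from the text for
    avoiding nodes under the isolated child's own domain.
    """
    eligible = [
        c for c in candidates
        if c.level >= own_level       # hierarchical-level restriction
        and c.free_child_slots > 0    # can accept an extra child
        and c.load <= max_load        # not overloaded (threshold is assumed)
    ]
    if not eligible:
        return None
    # Assumed tie-break: prefer the least loaded candidate.
    return min(eligible, key=lambda c: (c.load, c.address))


def level_after_reconnection(parent_level: int) -> int:
    """Assumed update rule: a child sits one level below its new parent."""
    return parent_level - 1
```

The candidate list could come either from a stored list of alternative parents or from querying neighbours in real time; the sketch is indifferent to which.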
Direct Link Establishment
Once the parentless node manages to find a new parent that can accept it as a child, it initiates
the procedure for the establishment of a direct link between itself and the old parent node or, in
parent; however, it cannot assume it is still working, hence it starts a flood fill mechanism in order
to be able to establish communications with either its old parent or the parent of its old parent.
Only nodes that have lost communication with a child node respond to the flood fill packet; nodes
not involved in a re-configuration process simply replicate it.
The flood fill packet must carry two addresses: the address of the old parent node and the address
of the reconnected child node that initiated the flood fill search. Therefore, if a node that has lost
a child node identifies the first address either as its own or as the address of the lost child, it can
then respond to the flood fill search and establish a direct link with the second address listed in
the flood fill packet. If no answer is obtained by the reconnected child within a time-out period, it
retries a number of times; if still no answer is obtained, it assumes the damage involves a larger
part of the network and therefore probably needs the intervention of the network operator.
However, even if the direct link cannot be established, the reconnected child can proceed with the
database reconstruction through the immediate restoration method. Figures 4.9 and 4.10
exemplify link and node failures, respectively.
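The handling of a flood fill packet and the retry behaviour of the reconnected child can be sketched as below. This is a minimal sketch under stated assumptions: `FloodFillPacket`, `Node`, `handle`, `search_with_retries` and the `send` callback are hypothetical names, the retry count is arbitrary, and a node that has lost a child but does not match the first address is assumed to replicate the packet like any uninvolved node.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set, Tuple


@dataclass(frozen=True)
class FloodFillPacket:
    old_parent: str  # first address: the initiator's old parent
    initiator: str   # second address: the reconnected child that started the search


@dataclass
class Node:
    address: str
    lost_children: Set[str] = field(default_factory=set)

    def handle(self, pkt: FloodFillPacket) -> Tuple[str, Optional[str]]:
        """Decide what to do with an incoming flood fill packet.

        A node that has lost a child responds if the first address is its
        own (link failure) or that of its lost child (node failure).
        Every other node simply replicates the packet.
        """
        if self.lost_children and (
            pkt.old_parent == self.address
            or pkt.old_parent in self.lost_children
        ):
            # Establish the direct link with the second address.
            return ("respond", pkt.initiator)
        return ("replicate", None)


def search_with_retries(send: Callable[[FloodFillPacket], Optional[str]],
                        pkt: FloodFillPacket,
                        max_retries: int = 3) -> Optional[str]:
    """Issue the flood fill search, retrying after each time-out.

    `send` is an assumed callback that returns the responder's address,
    or None when the time-out period expires without an answer.
    """
    for _ in range(1 + max_retries):
        responder = send(pkt)
        if responder is not None:
            return responder
    # No answer: the damage is presumably larger. The caller can still
    # proceed with immediate restoration without a direct link.
    return None
```

In the link failure case the old parent matches the first address as its own; in the node failure case the parent of the failed node matches it as the address of its lost child.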
Figure 4.9: Link failure. After the isolated child reconnects to a new parent, a direct link is established between itself and the old parent node.
Consequences of Failing to Establish Direct Link
Although the direct link establishment procedure should not fail unless major hardware problems
are present in the network (in which case external intervention would be required), it is important
to analyse the consequences of not being able to perform database restoration on demand. If the
old parent (or parent of failed unit) receives a call request destined to one of its disconnected
children, there are no serious consequences: it simply drops it and the call is not connected. If the
request is a location change, then the old parent (or parent of failed unit) updates its database and
deletes the packet. The consequence is that when the reconnected child finally issues an
immediate restoration packet for that entry, it will rebuild the old data path and delete the one
pointing to the customer's current location. This situation will only be corrected when the
customer re-registers. Another consequence of the old parent (or parent of failed unit) not having
been able to contact the lost children and establish a direct link to them is that it cannot perform
the test described in Section 4.1.1.4 of checking if the flagged packet was generated by them. It
responds to a flagged location change request destined to one of its lost children by simply updating
its database and deleting the packet. If the flagged packet was part of another simultaneous
re-configuration procedure in the same portion of the network, when the flagged packet relative to
the reconfiguration the old parent is involved in is finally issued for the same entry, the
corresponding data path will be left inconsistent. The system is still able to deal with these
inconsistencies, although with a higher cost in terms of network resources. However, as
mentioned earlier, failure to establish the direct link should be a rare occurrence.
Figure 4.10: Node failure. Isolated children reconnect to new parents and establish direct links to the parent of the failed node.
The Case of Multiple Direct Links
It must be noted that, in case of node failure, more than one direct link will be established: there
will be one link for each of the children of the failed node (see Figure 4.10). In this more general
case, the parent of the failed node will not be able to determine to which reconnected child it
should send a request destined to the failed node. One solution is simply to send the request to all
direct links. The children that do not have the required information ignore the packet. In the
current protocol, for the management reconfiguration case, a reconnected child node assumes an
error has occurred if it does not have the information required by a packet sent by its old parent
node via the direct link. This procedure would have to be changed so that the reconnected child
node does not perform error detection for packets coming from its old parent. However, if the
reconnected child supposed to have the required information has lost it, it will fail to detect the
inconsistency. The consequence is that the data path will remain corrupted and call and location
change requests will be rejected for the duration of the database restoration procedure. As soon as
the database restoration procedure is over, the inconsistent data paths will be cleared when the