

4.3.2 Node and Link Failure

In the more complex situation where there is a real hardware failure in the network, the basic mechanism is the same as for the case of management re-configuration discussed previously. There are some extra considerations, though. For the management re-configuration case, it was assumed that the link to the new parent is constructed before the disconnection is made; therefore there are no time delays between disconnection and establishment of the new link. For the hardware failure case, this assumption is no longer valid. The network first has to detect that a failure has occurred; the parentless node (or nodes) then has to find a node capable of accepting an extra child and establish the new link so that the database restoration procedure can be initiated. If the delays involved in this procedure are significant, the data paths within the system will become inaccurate, as updates cannot be forwarded to the nodes involved in the hardware failure while they are not reconnected to a new parent.

It is important to note that, if a node realizes it can no longer communicate with either a parent or a child, it cannot establish the extent of the damage. The node cannot distinguish between a simple link failure and a massive hardware failure involving several nodes. All it can determine is that the communication between itself and either a parent or a child has been disrupted. Therefore the recovery mechanism has to be general for both link and node failures. Again the problem is divided into two parts: the detection of a hardware failure, and the recovery mechanism followed in order to reconnect the isolated nodes and restore the network databases.

4.3.2.1 Detection

The precise detection method will depend on which signalling protocol is used. For instance, if the CCITT system No. 7 is used, link operability would be indicated by the constant transmission of packets even if there is no information to be sent (the fill-in signal unit) [113]. The time delay necessary for a node failure to be detected is also dependent on the adopted signalling protocol. But basically, a node assumes a hardware failure has occurred if no packet is received from a neighbouring node within a pre-determined period of time (this time interval depends on the
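The time-out rule described above can be sketched as follows. This is only a minimal illustration in Python, not the detection method of any particular signalling protocol. The class name NeighbourMonitor, its methods and the time-out value are hypothetical and do not come from the text.

```python
import time
from typing import Dict, List, Optional


class NeighbourMonitor:
    """Records when a packet was last received from each neighbouring node."""

    def __init__(self, timeout_s: float) -> None:
        # Assumed: a single pre-determined time-out shared by all neighbours.
        self.timeout_s = timeout_s
        self.last_seen: Dict[str, float] = {}

    def packet_received(self, neighbour: str, now: Optional[float] = None) -> None:
        # Any packet counts, including fill-in traffic sent on an idle link.
        self.last_seen[neighbour] = time.monotonic() if now is None else now

    def suspected_failures(self, now: Optional[float] = None) -> List[str]:
        # A neighbour is suspected once it has been silent for longer than the
        # time-out. The node cannot tell a link failure from a node failure.
        current = time.monotonic() if now is None else now
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if current - seen > self.timeout_s
        )
```

A node would call packet_received for every incoming packet and poll suspected_failures periodically; any neighbour returned is then treated as lost, whatever the underlying cause.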

4.3.2.2 Recovery Mechanism

In the management re-configuration case, the old parent and the reconnected child keep a direct link between themselves so that requests for information that has not yet been restored can be transferred directly to the reconnected child node and information can be restored on demand. This direct link is disconnected as soon as the database restoration is finished. Therefore, we need a means of directly connecting the old parent and the reconnected child in case of link failure and, in case of node failure, the parent and the children of the failed node.

Either the old parent or the reconnected child (or children) can initiate the process for the establishment of the direct link. The process is best initiated by the reconnected child (or children) because the old parent node cannot determine whether or not the child nodes have already reconnected to a new parent, nor how many of them there are. Hence, the first task performed by the nodes that have lost connection to their parent is to try to find a new parent node that can accept an extra child (i.e. the availability of buffer space and the computation load on the prospective new parent need to be checked). A node can either keep a list of alternative addresses of prospective new parents or obtain this information in real time by querying its neighbours.

New Parent Node Selection Procedure

Restrictions have to be placed on the selection of a new parent node. An isolated child should not be allowed to connect to a node under its own domain, as this would create loops in the network topology. However, nodes are only aware of the addresses of their parent and immediate child nodes; a node does not keep information on its grandchildren, great-grandchildren, etc. Moreover, given that the network topology is dynamic, the nodes under the isolated child's domain can vary. If the nodes keep a list of alternative new parents, this list has to be dynamically updated as reconfigurations occur, so that it conforms to the current network topology. The difficulty in keeping an up-to-date list of prospective new parents is due to the fact that each node operates independently and very little information is kept by the nodes about the current network topology. One possible procedure for selecting a new parent is for the isolated child to give preference, or even restrict its choice, to nodes that are at a hierarchical level equal to or greater than its own. All nodes, then, need to keep information about their current hierarchical level. Their initial level is set up when the system is first implemented. As reconfigurations occur, the reconnected nodes update their hierarchical level according to their new parent's level, keeping this information up to date.
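The level-based restriction can be illustrated with a short Python sketch. Several details here are assumptions made only for the example: a larger level number is taken to mean a position closer to the root, a child is placed one level below its parent, buffer space and computation load are reduced to the hypothetical fields free_child_slots and load, and ties are broken by picking the least loaded candidate. None of these specifics are prescribed by the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    address: str
    level: int             # assumed: larger value = closer to the root
    free_child_slots: int  # stand-in for the buffer space check
    load: float            # stand-in for computation load, 0.0 to 1.0


def select_new_parent(
    own_level: int, candidates: Iterable[Candidate], max_load: float = 0.8
) -> Optional[Candidate]:
    """Pick a prospective parent at a level equal to or greater than our own."""
    eligible = [
        c
        for c in candidates
        # A node at our level or above cannot lie inside our own domain,
        # so connecting to it cannot create a loop.
        if c.level >= own_level
        and c.free_child_slots > 0
        and c.load <= max_load
    ]
    if not eligible:
        return None
    # Assumed tie-break: least loaded first, then address for determinism.
    return min(eligible, key=lambda c: (c.load, c.address))


def level_after_reconnect(parent_level: int) -> int:
    """Assumed update rule: a child sits one level below its new parent."""
    return parent_level - 1
```

The candidate list could come either from a stored list of alternative parents or from querying neighbours in real time; the filter is the same in both cases.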

Direct Link Establishment

Once the parentless node manages to find a new parent that can accept it as a child, it initiates the procedure for the establishment of a direct link between itself and the old parent node or, in

parent; however, it cannot assume it is still working, hence it starts a flood fill mechanism in order to be able to establish communications with either its old parent or the parent of its old parent. Only nodes that have lost communication with a child node respond to the flood fill packet; nodes not involved in a re-configuration process simply replicate it.

The flood fill packet must carry two addresses: the address of the old parent node and the address of the reconnected child node that initiated the flood fill search. Therefore, if a node that has lost a child node identifies the first address either as its own or as the address of the lost child, it can then respond to the flood fill search and establish a direct link with the second address listed in the flood fill packet. If no answer is obtained by the reconnected child within a time-out period, it retries a number of times; if still no answer is obtained, it assumes the damage involves a larger part of the network and therefore probably needs the intervention of the network operator. However, even if the direct link cannot be established, the reconnected child can proceed with the database reconstruction through the immediate restoration method. Figures 4.9 and 4.10 exemplify link and node failures, respectively.
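The response rule and the retry behaviour just described can be sketched in Python as follows. The names FloodFillPacket, handle_flood_fill and search_with_retries, the send_and_wait callback and the default of three attempts are all hypothetical; the text specifies only the two addresses, the matching rule and the fact that the search is retried a number of times.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass(frozen=True)
class FloodFillPacket:
    old_parent: str  # first address: old parent of the searching node
    initiator: str   # second address: reconnected child that started the search


def handle_flood_fill(
    own_address: str, lost_children: Set[str], packet: FloodFillPacket
) -> Optional[str]:
    """Return the address to open a direct link to, or None to only replicate."""
    if not lost_children:
        # Not involved in a re-configuration: just replicate the packet.
        return None
    # Link failure: the first address is our own.
    # Node failure: the first address is the child we have lost.
    if packet.old_parent == own_address or packet.old_parent in lost_children:
        return packet.initiator
    return None


def search_with_retries(
    send_and_wait: Callable[[FloodFillPacket], bool],
    packet: FloodFillPacket,
    max_attempts: int = 3,
) -> bool:
    """Flood the packet; True if an answer arrived within some time-out period."""
    for _ in range(max_attempts):
        if send_and_wait(packet):
            return True
    # No answer: the damage is assumed to be larger and is left to the operator.
    # The child can still proceed with immediate restoration.
    return False
```

In the link failure case the old parent itself matches the first address; in the node failure case it is the parent of the failed node that matches, because the first address names the child it has lost.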

Figure 4.9: Link failure. After the isolated child reconnects to a new parent, a direct link is established between itself and the old parent node.

Consequences of Failing to Establish Direct Link

Although the direct link establishment procedure should not fail unless major hardware problems are present in the network (in which case external intervention would be required), it is important to analyse the consequences of not being able to perform database restoration on demand. If the old parent (or parent of failed unit) receives a call request destined to one of its disconnected children, there are no serious consequences: it simply drops it and the call is not connected. If the request is a location change, then the old parent (or parent of failed unit) updates its database and deletes the packet. The consequence is that when the reconnected child finally issues an immediate restoration packet for that entry, it will rebuild the old data path and delete the one pointing to the customer's current location. This situation will only be corrected when the customer re-registers. Another consequence of the old parent (or parent of failed unit) not having been able to contact the lost children and establish a direct link to them is that it cannot perform the test described in Section 4.1.1.4 of checking if the flagged packet was generated by them. It responds to a flagged location change request destined to one of its lost children by simply updating its database and deleting the packet. If the flagged packet was part of another simultaneous re-configuration procedure in the same portion of the network, when the flagged packet relative to the reconfiguration the old parent is involved in is finally issued for the same entry, the corresponding data path will be left inconsistent. The system is still able to deal with these inconsistencies, although with a higher cost in terms of network resources. However, as mentioned earlier, failure to establish the direct link should be a rare occurrence.
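The fallback behaviour of the old parent when no direct link exists can be summarised in a small Python sketch. The packet layout is hypothetical: in particular, this section does not say how the database is updated, so the update is represented here by an assumed new_pointer field (None meaning that the entry is removed). The database itself is modelled as a plain dictionary from customer entry to next node.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Packet:
    kind: str                          # "call_request" or "location_change"
    entry: str                         # customer entry the packet refers to
    destination: str                   # child the packet would be forwarded to
    new_pointer: Optional[str] = None  # hypothetical: value to store, None = remove
    flagged: bool = False              # flagged packets are treated the same here


def handle_without_direct_link(
    db: Dict[str, str], lost_children: Set[str], packet: Packet
) -> str:
    """Old parent's handling of a packet when no direct link could be set up."""
    if packet.destination not in lost_children:
        return "forward"  # normal case, outside the scope of this sketch
    if packet.kind == "call_request":
        # No serious consequence: the call is simply not connected.
        return "dropped"
    if packet.kind == "location_change":
        # Flagged or not, the origin test cannot be performed without the
        # direct link, so the node only updates itself and deletes the packet.
        if packet.new_pointer is None:
            db.pop(packet.entry, None)
        else:
            db[packet.entry] = packet.new_pointer
        return "updated_and_deleted"
    raise ValueError(f"unknown packet kind: {packet.kind}")
```

Because the location change never reaches the lost child, a later immediate restoration packet from that child can rebuild the stale data path, which is the inconsistency discussed above.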

Figure 4.10: Node failure. Isolated children reconnect to new parents and establish direct links to the parent of the failed node.

The Case of Multiple Direct Links

It must be noted that, in case of node failure, more than one direct link will be established: there will be one link for each of the children of the failed node (see Figure 4.10). In this more general case, the parent of the failed node will not be able to determine to which reconnected child it should send a request destined to the failed node. One solution is simply to send the request to all direct links. The children that do not have the required information ignore the packet. In the current protocol, for the management reconfiguration case, a reconnected child node assumes an error has occurred if it does not have the information required by a packet sent by its old parent node via the direct link. This procedure would have to be changed so that the reconnected child node does not perform error detection for packets coming from its old parent. However, if the reconnected child supposed to have the required information has lost it, it will fail to detect the inconsistency. The consequence is that the data path will remain corrupted and call and location change requests will be rejected for the duration of the database restoration procedure. As soon as the database restoration procedure is over, the inconsistent data paths will be cleared when the