CHAPTER 3. MANAGEMENT FRAMEWORK

3.3. Issues in Distributed System

3.3.1. Consistency

Distributed systems face many consistency problems, such as duplicate requests, messages arriving out of order, and multiple instances of a resource existing simultaneously. In our system, these problems take the following forms:

1. Two or more managers managing the same resource;

2. Old messages arriving after newer messages;

3. Multiple copies of a resource existing at the same time (orphaned resources). This can occur when resources are software services and the management framework spawns a new instance of a service to compensate for a possible failure of the old instance; and

4. Multiple system health check routines spawned by the bootstrap service, which may each see an incoherent system state and further degrade the consistency of that state.

To address these issues we impose the following consistency check scheme:

i. We rely on the registry to generate a unique Instance ID (IID) for each resource instance or manager thread created. This ID could be an NTP timestamp or a simple sequence number, provided that it is guaranteed to be unique and monotonically increasing. Further, we assume that when the registry generates this number, all replicas of the registry have the same view of it.

ii. Every time a new resource requires management, it needs to register itself with the registry. Further, each service adapter periodically renews its registration. This facility is required to detect duplicate resources (in the case when the management framework can create new instances of resources, for example by spawning processes). The resource’s service adapter receives its Instance ID in the registration response when it first registers. During renewals, the registry simply returns the currently known Instance ID.

iii. A resource-specific manager thread also obtains its unique Instance ID when it is assigned a resource to manage. This unique id is used by the resource-specific manager to construct a unique Message ID to be used for every message sent from that entity. This Message ID is a combination of the sender’s Instance ID and a monotonically increasing sequence number. Retries of requests use the same Message ID rather than generating a new Message ID. This allows the resource to discard duplicates.
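
The ID scheme in steps i to iii can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the framework's implementation: the names `Registry`, `MessageId`, and `Sender` are hypothetical, the registry is a single in-process object (replication is not modeled), and a simple counter stands in for the Instance ID source.

```python
import itertools
from dataclasses import dataclass


class Registry:
    """Hypothetical single-node registry; replicas are not modeled."""

    def __init__(self):
        self._next_iid = itertools.count(1)  # unique, monotonically increasing
        self._current = {}                   # entity name -> current Instance ID

    def register(self, name):
        """Assign a fresh Instance ID to a new resource or manager instance."""
        iid = next(self._next_iid)
        self._current[name] = iid
        return iid

    def renew(self, name):
        """A renewal returns the currently known Instance ID; none is generated."""
        return self._current[name]


@dataclass(frozen=True, order=True)
class MessageId:
    instance_id: int  # sender's Instance ID
    sequence: int     # per-sender, monotonically increasing


class Sender:
    """Hypothetical message source, e.g. a resource-specific manager thread."""

    def __init__(self, instance_id):
        self.instance_id = instance_id
        self._seq = itertools.count(1)

    def new_message_id(self):
        return MessageId(self.instance_id, next(self._seq))


registry = Registry()
adapter_iid = registry.register("resource-R")
manager_iid = registry.register("manager-for-R")

manager = Sender(manager_iid)
first = manager.new_message_id()
retry = first                     # a retry reuses the original Message ID
second = manager.new_message_id()
```

Because a retry carries the same Message ID as the original request, the receiver can recognize it as a duplicate by comparing IDs alone.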

We now discuss how the above inconsistencies may be resolved under this scheme:

1. If a manager process is considered dead / unreachable due to a missed / delayed heartbeat, the health check spawns a new manager process to take over the responsibility of managing the resources which were being managed by the previous manager. If the old manager tries to invoke a management operation on the resource, the service adapter inspects the message and can disregard the old manager’s request. Thus, a request coming from a manager with Instance ID A (IID_A) is considered by the resource’s service adapter to be obsolete if the resource is currently being managed by a manager with Instance ID B (IID_B) and IID_A < IID_B.

2. By keeping track of the last known successfully processed message’s message ID, duplicates and obsolete messages may be discovered and discarded.

3. In some cases, a user-defined policy may cause new instances of resources to be spawned when an old resource is deemed unreachable. In such cases, if the old resource comes back up, duplicate resources exist at the same time. This may lead to application-specific inconsistencies.

We assume that if a user-defined policy states that a new resource can be instantiated, a user-defined policy also exists on how to deal with multiple copies of a resource. Given this assumption, duplicate resources may be detected and inconsistencies may be resolved as follows:

a. Consider a resource-specific manager M and an instance R1 of resource R. Assume M is managing R1 and at some point concludes that R1 is unreachable. M then instantiates R2 by virtue of a user-defined policy.

b. Soon after, R1 comes back up and sends a heartbeat / event to manager M. In the current implementation, M discards this heartbeat / event. Hence, it will no longer manage R1.

c. Further, R1 also periodically renews itself in the registry. The registry response is simply the currently known Instance ID of the resource R. When R1 sees that R2

4. To prevent multiple health checks from running at the same time and introducing inconsistencies in the system, we implement the following policy:

a. A health check always runs for a pre-determined time interval x, during which it either reports success back to the bootstrap node or self-terminates if it cannot successfully bring up failed components of the management framework. Further, it also sends periodic heartbeats to the bootstrap service to indicate that it is alive and running.

b. If the health check routine becomes unreachable from the bootstrap service and is unable to deliver heartbeats in a timely fashion, the bootstrap node may conclude that the health check routine is dead and spawn a new health check process. This may introduce race conditions and inconsistencies if the old health check process is not actually dead.

c. To prevent this, the bootstrap service always waits a little over the health check interval x before spawning a new health check routine, unless the existing health check routine has reported success. Thus, even if the previous health check is unable to deliver heartbeats, it will have self-terminated after its interval x.
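
The ID comparisons in resolutions 1 and 2 and the waiting rule in resolution 4(c) can be sketched in Python as follows. This is an illustration only, not the framework's code: `ServiceAdapter`, `accept`, and `may_spawn_health_check` are hypothetical names, the `margin` parameter stands in for "a little over" x, and the sketch assumes that the adapter treats the highest manager Instance ID it has seen as the current manager (the text does not say how the adapter learns of a new manager).

```python
class ServiceAdapter:
    """Hypothetical receiver-side checks on (sender Instance ID, sequence)."""

    def __init__(self, manager_iid):
        self.manager_iid = manager_iid  # Instance ID of the current manager
        self.last_seq = 0               # sequence of the last processed message

    def accept(self, sender_iid, seq):
        """Return True if the message should be processed, False if discarded."""
        if sender_iid < self.manager_iid:
            return False                # resolution 1: obsolete manager
        if sender_iid > self.manager_iid:
            # Assumption: a higher Instance ID means a newer manager took over.
            self.manager_iid = sender_iid
            self.last_seq = 0
        if seq <= self.last_seq:
            return False                # resolution 2: duplicate or old message
        self.last_seq = seq
        return True


def may_spawn_health_check(elapsed, interval_x, margin, previous_reported_success):
    """Resolution 4(c): spawn a new health check only if the previous one
    reported success or a little over interval x has elapsed since it started."""
    return previous_reported_success or elapsed > interval_x + margin


adapter = ServiceAdapter(manager_iid=5)
decisions = [
    adapter.accept(5, 1),   # first request from the current manager
    adapter.accept(5, 1),   # retry with the same Message ID
    adapter.accept(4, 9),   # request from an older manager
]
```

In this sketch a message that arrives after a newer one from the same sender is dropped rather than reordered, which matches the discard-based handling described above.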

As a final note, the auto-instantiation scheme presented above poses a problem when the system becomes partitioned and managers in each partition spawn duplicate processes to compensate for all missing processes. As an illustration, if one partition (A) contains about 97% of resources while the other partition (B) contains 3% of resources, the managers in partition B can end up rebuilding the missing 97% of resources there. The problem becomes significantly more complex with 3 partitions containing 98%, 1% and 1% of resources. While we consider this situation to be out of scope of our current work, we envision that appropriate mechanisms and policies could be specified to handle such situations.

As an illustration, a failure-recovery mechanism could state that spawning processes to compensate for missing resources should be done only if the missing resources do not exceed y%. A user-defined policy could then set y = 3, thus bounding the number of duplicate resources that managers can spawn.
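
Such a threshold policy might look like the following Python sketch. The function name `may_spawn_replacements` and its parameters are hypothetical, and how a manager counts missing and total resources is assumed rather than taken from the framework.

```python
def may_spawn_replacements(missing, total, max_missing_percent):
    """Allow auto-instantiation only if at most y% of resources are missing."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Integer-friendly form of: missing / total * 100 <= max_missing_percent
    return 100 * missing <= max_missing_percent * total


# With y = 3: partition A sees 3 of 100 resources missing, partition B sees 97.
partition_a_may_spawn = may_spawn_replacements(3, 100, 3)
partition_b_may_spawn = may_spawn_replacements(97, 100, 3)
```

Under this policy, only the managers in the large partition would compensate for missing resources; those in the small partition would refrain.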

In document SCALABLE, FAULT TOLERANT MANAGEMENT OF GRID SERVICES: APPLICATION TO MESSAGING MIDDLEWARE (Page 55-59)