manager require no messages.
KEY: resource block; directory entry for resource (implemented as a resource block); lock block
Figure 6 Conversion Request on a Node that Is Not the Resource Manager
These characteristics of the distributed lock manager (i.e., total space and message traffic behavior that is subject to a linear bound in the "workload") are a significant factor in allowing VAXcluster systems to act as distributed operating systems. These characteristics suggest that, from the distributed lock manager's viewpoint, additional growth in the size of VAXcluster configurations is certainly viable.
Performance Aspects of the Distributed Lock Manager
Table 5 summarizes the performance of the distributed lock manager. The measurements reflect operations that are normally done in pairs. Such operations include an $ENQ followed by a $DEQ, and a conversion to a more restrictive mode (up) followed by a conversion to a less restrictive mode (down). The operations reported in the table are performed on sublocks.
Digital Technical Journal No. 5 September 1987
When Processors Join or Leave the VAXcluster System
The connection manager plays a major role in the lock manager's ability to deal with configuration changes when one or more nodes join or leave the VAXcluster system. When the membership of the cluster must be altered, a coordinator node is elected to lead the other nodes through the state transition. Any node can become the coordinator
VAXcluster Systems
The VAX/VMS Distributed Lock Manager
Table 4 Summary of Number of Messages Used for Lock Requests

| Request Type | Messages | Comments |
|---|---|---|
| Initial root-lock request from a system for a previously unknown resource (i.e., no manager exists) | 2 or 0 | Zero messages if node making the request is the directory node. Otherwise two messages: a directory lookup request followed by a "do local" response. |
| Subsequent root-lock requests on resource manager | 0 | |
| Sublock request on resource manager | 0 | |
| Unlock request on resource manager with locks remaining | | |
| Unlock of last lock on resource by resource manager | 1 or 0 | Remove directory entry message sent to directory node. No message sent if manager is also directory node. |
| Initial root-lock request from a system for a resource that is known (i.e., a manager exists) | 2 or 4 (1) | If requester is the directory node, two messages consisting of a lock request followed by a response from the manager. If requester is not directory node, do a directory lookup, a resend to manager response, a lock request to the manager, and a response back. |
| Sublock requests and subsequent root-lock requests from a system that is not resource manager | 2 (1) | Lock request to manager and a response back. |
| Unlock request from a system that is not the resource manager | 1 or 2 | Dequeue message to manager. Manager may then send a remove directory message to directory node if this lock is the last one. |

NOTE: If the lock request cannot be granted immediately, add one message. If the lock is granted, blocking another request, and a blocking AST was requested, add one message. In all cases the number of messages is independent of the number of nodes in the VAXcluster system.
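The counting rules in Table 4 can be sketched as a small lookup function. This is only an illustration of the table, not the lock manager's code: the `Request` names, the `message_count` function, and its keyword flags are hypothetical. Two readings are assumptions: that the NOTE's extra messages apply only to the rows marked (1), and that a remote unlock of the last lock sends no remove-directory message when the manager is also the directory node (by analogy with the "1 or 0" row).

```python
from enum import Enum, auto


class Request(Enum):
    """Hypothetical labels for the rows of Table 4 that list a message count."""
    ROOT_LOCK_NO_MANAGER = auto()     # initial root lock, resource unknown
    LOCK_ON_MANAGER = auto()          # subsequent root lock or sublock on the manager
    UNLOCK_LAST_ON_MANAGER = auto()   # manager releases the last lock
    ROOT_LOCK_KNOWN_MANAGER = auto()  # initial root lock, a manager already exists
    LOCK_NOT_ON_MANAGER = auto()      # sublock or subsequent root lock, remote manager
    UNLOCK_NOT_ON_MANAGER = auto()    # unlock sent to a remote manager


# Assumption: the NOTE's adjustments apply to the rows marked "(1)".
NOTE_APPLIES = {Request.ROOT_LOCK_KNOWN_MANAGER, Request.LOCK_NOT_ON_MANAGER}


def message_count(request, *, requester_is_directory=False,
                  manager_is_directory=False, last_lock=False,
                  must_wait=False, blocking_ast_sent=False):
    """Return the number of messages for one request, following Table 4."""
    if request is Request.ROOT_LOCK_NO_MANAGER:
        # Directory lookup plus "do local" response, unless we are the directory.
        count = 0 if requester_is_directory else 2
    elif request is Request.LOCK_ON_MANAGER:
        count = 0
    elif request is Request.UNLOCK_LAST_ON_MANAGER:
        # One "remove directory entry" message unless manager is the directory.
        count = 0 if manager_is_directory else 1
    elif request is Request.ROOT_LOCK_KNOWN_MANAGER:
        # Lock request + response, preceded by lookup + resend if not directory.
        count = 2 if requester_is_directory else 4
    elif request is Request.LOCK_NOT_ON_MANAGER:
        count = 2
    elif request is Request.UNLOCK_NOT_ON_MANAGER:
        # Dequeue message, plus an optional remove-directory message (assumed
        # to be skipped when the manager is also the directory node).
        count = 2 if (last_lock and not manager_is_directory) else 1
    else:
        raise ValueError(f"unknown request type: {request!r}")

    if request in NOTE_APPLIES:
        count += int(must_wait) + int(blocking_ast_sent)
    return count
```

The function takes no cluster-size parameter, which mirrors the NOTE's point that the message count does not depend on the number of nodes.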
Table 5 Performance Summary of the Distributed Lock Manager

VAX-11/780 VAXcluster System Locking Using the Computer Interconnect (CI780)

| Operation | Local Locking: Local CPU | Remote Locking: Local CPU | Remote Locking: Remote CPU | Remote Locking: Elapsed Time |
|---|---|---|---|---|
| ENQ + DEQ | 0.6 | 2.7 | 1.5 | 3.9 |
| CVT (up+down) | 0.4 | 2.4 | 1.3 | 3.3 |

MicroVAX II Locking Using the Ethernet

| Operation | Local Locking: Local CPU | Remote Locking: Local CPU | Remote Locking: Remote CPU | Remote Locking: Elapsed Time |
|---|---|---|---|---|
| ENQ + DEQ | 0.7 | 6.0 | 4.8 | 8.1 |
| CVT (up+down) | 0.5 | 5.6 | 4.6 | 7.8 |

• All numbers are in milliseconds
• For Local Locking, Local CPU ≈ Elapsed Time
• ENQ refers to a lock operation, DEQ refers to an unlock, and CVT to a mode conversion
and it is usually the first to discover that a membership change is required. The need for a membership change can result from timing out a broken connection, or upon discovering a new node. All configuration changes are made using a two-phase commit protocol to ensure consistency on all nodes. To add or remove a node, the coordinator describes a proposed configuration to the other members. They have the option of agreeing or disagreeing with the proposed configuration.
They will disagree if they can construct a more optimal configuration based on the number of nodes they can communicate with and on the assignment of votes to those nodes. The resulting VAXcluster system can only consist of a strongly connected group of nodes where every node has a connection to each of the others.
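The agreement step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the connection manager's algorithm. The text does not define "more optimal", so the sketch assumes a member prefers a fully connected group with more total votes, and that the only alternative a member considers is itself plus every node it can reach directly. All names (`fully_connected`, `member_agrees`, `propose`) are hypothetical.

```python
from itertools import combinations


def fully_connected(members, links):
    """True if every pair of members has a direct connection."""
    return all(frozenset(pair) in links for pair in combinations(members, 2))


def total_votes(members, votes):
    """Sum of the votes assigned to the given nodes."""
    return sum(votes[node] for node in members)


def member_agrees(node, proposal, links, votes):
    """Decide whether one member accepts the coordinator's proposal."""
    # The resulting cluster must have a connection between every pair of nodes.
    if not fully_connected(proposal, links):
        return False
    # Assumed alternative: this node plus every node it can talk to directly.
    reachable = {node} | {n for n in votes if frozenset((node, n)) in links}
    # Assumed meaning of "more optimal": a valid group with more total votes.
    if (fully_connected(reachable, links)
            and total_votes(reachable, votes) > total_votes(proposal, votes)):
        return False
    return True


def propose(coordinator, proposal, links, votes):
    """Phase 1: poll the other members. Phase 2: commit only if all agree."""
    others = set(proposal) - {coordinator}
    if all(member_agrees(n, proposal, links, votes) for n in others):
        return True   # every member agreed, so the change can be committed
    return False      # at least one member disagreed, so the change is not made


# Example: A, B, and C are mutually connected; D can reach only C.
links = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]}
votes = {"A": 1, "B": 1, "C": 1, "D": 1}
print(propose("A", {"A", "B", "C"}, links, votes))
print(propose("A", {"A", "B"}, links, votes))
```

In the example, the first proposal is a fully connected group that no member can improve on under the assumed metric. The second is rejected because B can reach a fully connected group {A, B, C} with more votes.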
In case of disagreement, the coordinator backs out of the operation, waits a random amount of time, and then initiates the election protocol again. During this interval other nodes can attempt to become the coordinator. Disagree-