From:	SMTP%"RELAY-INFO-VAX@CRVAX.SRI.COM" 12-JUN-1993 11:53:44.34
To:	EVERHART
CC:	
Subj:	Re: Difference between VMSclusters and NFS

X-Newsgroups: comp.os.vms
From: zrepachol@cc.curtin.edu.au (Paul Repacholi)
Subject: Re: Difference between VMSclusters and NFS
Message-Id: <1993Jun8.055749.1@cc.curtin.edu.au>
Lines: 59
Sender: news@cujo.curtin.edu.au (News Manager)
Organization: Curtin University of Technology
Date: Mon, 7 Jun 1993 20:57:49 GMT
To: Info-VAX@kl.sri.com
X-Gateway-Source-Info: USENET

In article <1671@se.alcbel.be>, mvbr@se.alcbel.be (Marc Verbruggen) writes:
> Now, my question : In what sense do these 3 technologies differ ? Is the lock
> manager concept in clusters crucial ? Is there no such thing within Unix or the
> Distributed Services ?

The distributed lock manager is one of the key bits of a cluster.
The REAL workers though are MSCP, SCS and the connection manager.
A cluster is very 'real' in that its members co-operate to keep it
intact over time. Hands up all those who have forgotten about a diskless
node in some forgotten corner, and had to do a cluster re-boot again ;-(

SCS and the connection manager provide a set of communication services
that MSCP and the DLM can rely on. With SCS and MSCP you can do nearly
anything NFS can do, including screw up mightily! The problem is, of
course, co-ordination of the use of resources. So the connection manager
maintains the integrity of the connections, and identifies alternate
paths to each device. This is essential, even if you don't want the
redundancy, as you must know if your ultimate target is the same device,
or a different one with a similar name. This is why the cluster manuals
rave on about allocation classes and the like.
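
As a rough illustration of why the naming matters, here is a toy
sketch in Python (not VMS code). The function and the node names are
made up for the example, and the naming rule is a simplification: with
an allocation class the device name no longer depends on which node
serves it, so two paths to one disk resolve to the same name.

```python
# Toy model only: a simplified version of cluster device naming.
# cluster_device_name, HSC001/HSC002 and NODEA/NODEB are hypothetical.

def cluster_device_name(node, device, alloc_class=0):
    """Return a cluster-wide name for a device seen through `node`.

    With a non-zero allocation class the name does not depend on the
    serving node, so two paths to one disk compare equal.
    """
    if alloc_class:
        return f"${alloc_class}${device}"
    return f"{node}${device}"

# One dual-pathed disk, served by two nodes in allocation class 1.
path_a = cluster_device_name("HSC001", "DUA0", alloc_class=1)
path_b = cluster_device_name("HSC002", "DUA0", alloc_class=1)

# Two different local disks that merely share a unit name.
local_a = cluster_device_name("NODEA", "DUA0")
local_b = cluster_device_name("NODEB", "DUA0")

print(path_a == path_b)    # same device, reached by two paths
print(local_a == local_b)  # different devices, similar names
```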

When you have each device identified, you can then control its use. Note
that clusters use 'discretionary locking', not enforced locking. You
CAN step around the lock manager and access something without any
synchronization to the rest of the cluster, or issue a lock request for
a key resource name and hang the cluster as everyone else waits for you
to free it. ( This is the common cause of cluster hangs. Something in one
machine gets the lock on SYSUAF or the system volume and gets stuck. Then
all the other machines hang, waiting for the lock to be freed. )
Note, it is not EASY to bypass locking, just possible. For instance, RMS
does all the record and index locking in files. By not using RMS, and
going straight to the XQP with QIOs you can open the file and access the
data. You just have to do ALL of the data access yourself.
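
To make 'discretionary' concrete, here is a small Python sketch. It is
only an analogy: AdvisoryLockTable, the resource name ACCOUNTS.DAT and
the two update functions are invented for the example and have nothing
to do with the real lock manager interface. The point is that the lock
protects the data only from code that bothers to ask for it.

```python
import threading

class AdvisoryLockTable:
    """Hypothetical stand-in for a lock manager: names map to locks."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def lock_for(self, resource_name):
        # Create the lock on first use, then always hand back the same one.
        with self._guard:
            return self._locks.setdefault(resource_name, threading.Lock())

table = AdvisoryLockTable()
record = {"balance": 100}

def polite_update(amount):
    # Co-operating code takes the lock before touching the data.
    with table.lock_for("ACCOUNTS.DAT"):
        record["balance"] += amount

def rude_update(amount):
    # Nothing stops code that never asks for the lock at all.
    record["balance"] += amount

holder = table.lock_for("ACCOUNTS.DAT")
holder.acquire()      # somebody holds the lock...
rude_update(-30)      # ...yet this write still goes through
holder.release()
polite_update(10)     # the well-behaved path, once the lock is free
```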

The lock manager can also be used to communicate across the cluster
by using the 'lock value block' in the ENQ/DEQ calls. This is the key
( along with the SCS services ) to things like OPCOM, the queue manager,
license management, etc. It is also VERY useful for running a critical
application over a cluster. One copy can create a resource name and put
an exclusive lock on it. All other copies ( on the other nodes ) hang
waiting for the lock. If the active node fails, *ONE* other node will
get the lock and can proceed.
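
The failover pattern can be imitated on one machine with threads. The
Python sketch below is a simulation, not the ENQ/DEQ interface: a
threading.Lock stands in for the exclusive lock on the resource name,
the node names are invented, and releasing the lock stands in for the
active node going down. The lock value block is not modelled.

```python
import threading

resource = threading.Lock()    # stands in for the exclusive lock
took_over = threading.Event()  # set once a standby becomes active
stop = threading.Event()       # set to shut the simulation down
active = []                    # nodes that have run the application

def standby(node):
    while not stop.is_set():
        if not resource.acquire(timeout=0.05):
            continue           # lock still held elsewhere, keep waiting
        try:
            if stop.is_set():
                return         # shutting down, do not take over
            active.append(node)
            took_over.set()
            stop.wait()        # "run the application" until shutdown
        finally:
            resource.release()
        return

resource.acquire()             # the primary copy holds the lock
active.append("PRIMARY")

threads = [threading.Thread(target=standby, args=(f"NODE{i}",))
           for i in range(3)]
for t in threads:
    t.start()

resource.release()             # the primary "fails", its lock goes away
took_over.wait()               # one standby gets the lock and proceeds
stop.set()
for t in threads:
    t.join()

print(active)                  # the primary, then the one standby that won
```

Which standby wins is not predictable, only that there is exactly one.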

In fact the DLM had an unexpected plus in non-clustered systems. Pre
VMS v4, all volume management was done by one process running F11BACP.
This process was responsible for all file level locking and
synchronisation. One effect of this was that it was nearly single
threaded for all the system. For v4, it HAD to use the DLM to sync with
other processes in other cluster members, so it could also use the DLM
for local sync. Now, as much of the global state is in the lock manager,
there is no reason not to let EACH PROCESS independently manage the file
structure, and use the DLM to sync against ALL other processes in the
cluster. Hence file processing is now multi-threaded, and only has to
pause to sync if there is an access conflict. It also enables each node
to pre-allocate things and cache them so as to speed things up. If the
node goes down, the information in the LVBs can be used to recover and
clean up.

~Paul