From: MERC::"uunet!CRVAX.SRI.COM!RELAY-INFO-VAX" 13-APR-1992 14:49:46.32
To:   joel@eco.twg.com, info-vax@sri.com, info-multinet@TGV.COM
CC:   WINTCP-L@UBVM.cc.buffalo.edu
Subj: Re: TGV MultiNet vs. WIN/TCP NFS Client

Joel Stevick writes:

> > The semantics of segregating the file data from the file attributes
> > mimics the VMS filesystem semantics where the attributes of the file
> > (i.e. the file header) are segregated from the file data.
>
> The volume-structure management policy for a locally-attached disk is NOT
> a reasonable model for a distributed file system - the scope of operations
> performed is very different.

Au contraire -- making the policy for a distributed file system the same as
for a local file system is essential if you want local users to regard the
distributed file system as nothing more than another local file system.  If
you make the distributed file system a 2nd-class file system (i.e. less than
one might expect from a local file system), then the user will be less likely
to want to use it.  The concept of regarding a distributed file system as
nothing more than a local disk is why VAXclusters have been so successful for
DEC.  We wanted to do this (as much as was technically possible) for VMS in
the NFS world.

> For instance, NFS Client does not have to worry about allocation of
> physical disk resources, but it does have to process page faults - a task
> normally left to disk device drivers.

The NFS Client does not have to DIRECTLY worry about allocation of physical
disk resources -- but so what?  The software ($QIO) interface to the local
file system doesn't deal in the allocation of physical disk resources either,
so this is not relevant.  The policies embodied in the semantics of the $QIO
interface ARE relevant, and to the extent it is technically possible, it is
important to provide these semantics.  Page faults are nothing more than $QIO
reads (the ONLY thing different about page faults is the use of an internal
driver software interface to get the $QIO started).  Here too, the scope of
operations is EXACTLY the same as that of a local disk.

> > It is important to allow arbitrary ACLs to be attached to any file.
> > ACLs for granting EXTRA access above and beyond the UIC-based access
> > are just a very small part of the VMS ACL world.
> > ...
> > You also need to support the myriad other non-SECURITY ACLs on
> > arbitrary files in order to make the NFS server file system "feel"
> > like a VMS file system.
>
> Again you have assumed that the model for implementing local file systems
> is meaningful in a distributed environment.  Implying that you support
> ACLs for security purposes means that the customer should feel secure
> that file accesses from the context of his/her system will be controlled.

NFS is inherently non-secure.  That comes with the protocol, and if TOTAL
security network-wide is a big issue then NFS should not be used at all.  Of
course, a local-area VAXcluster isn't all that secure either.  I can use
promiscuous mode to observe SCS traffic and reconstruct interesting parts of
the LAVC file systems after some period of time.  In a way, NFS and LAVC are
not all that different.  A rogue system on a local network can compromise
either file system.  LAVCs rely on the clients to organize and secure the
file systems.  In this case we rely on the NFS clients to organize (at a
higher level) and secure the file systems.
> What happens in the case where a hard link (in a different directory) is
> established to a file (with an associated alarm ACE) from a non-VMS
> client?  When the VMS client sees the new directory entry for the link,
> will it also detect that the file has an alarm?  If not, then the customer
> could be trusting that a file is being protected when it is not.  If this
> is the case, I hope that your documentation adequately warns customers of
> these security holes.  These kinds of problems do not occur on a local
> file system.

This is really an issue of how to integrate VMS and other NFS (e.g. UNIX)
environments.  Do you throw out all that the user expects to see with ACLs
because of the potential for a security problem?  That seems to presume that
we know "better" than our customers and must protect them, while reducing the
functionality of the system.  If you want to share things between VMS and
UNIX, you have to be prepared for the lowest common denominator in client
security.  It is manageable, and ACLs should not be discarded because of
this.  If you don't want to allow the ACL security semantics, that can be a
mount option -- but it doesn't throw out the ACLs, which are useful for other
things as well.

> > Yes, PathWorks using the NFS client as a gateway to NFS servers is a
> > VERY useful function.  The resources used to store the ACL information
> > are typically much smaller than the resources used to store the actual
> > file itself.  Even if it were as high as 20%, this is likely to be a
> > trade-off a customer would want to make for the transparent PathWorks
> > access to NFS servers.  Note that nothing prevents a client
> > implementation from having a mount option like /SEMANTICS=PATHWORKS so
> > that PathWorks-generated ACLs only have to exist within the "mind" of
> > the client itself and don't have to be stored on the NFS server at all.
>
> I'm not sure how you are measuring resource consumption.  If you are
> considering disk space as the resource, then perhaps 20% is a reasonable
> number.  However, if you consider that file systems are configured to
> support a FIXED quantity of files, having a meta file for each PC file
> can become significant overhead.

In reality, file servers usually run out of storage space before they run out
of file space.  This is sort of like saying: "Hey, you started using the file
server and it ran out of space!"  The whole point of the file server is to
provide the space for you to use.  If you need to place administrative limits
on the space, do so.  Quotas work over the NFS client.  This is a decision
you want to leave up to the customer; don't rule this feature out just
because it might use up resources -- we don't feel it is our role to decide
this for a customer.

> Having a /SEMANTICS=PATHWORKS would place a SEVERE limitation on the use
> of VMS systems as gateways.  For example, if the ACL information is stored
> at the client, then you could not have multiple groups of PCs (i.e.
> through more than one VMS gateway) sharing files.

I was NOT suggesting that the ACL information be stored at the NFS client.
That would NOT be a good idea.  What I meant was that in the case of
PathWorks the ACLs can actually be deduced at runtime and do not need to be
stored at ALL!  The client could compute them and act as though they were
there, obviating the need for most meta-files in this case.
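To make the runtime-deduction idea concrete, here is a minimal sketch
(illustrative only -- the mount-flag behavior, the ACE text, and the helper
routines are assumptions for the example, not the syntax or code of either
NFS client):

# Sketch of /SEMANTICS=PATHWORKS-style behavior: the ACL a PathWorks gateway
# expects to see is the same for every file under the mount point, so the
# client can synthesize it on each request instead of storing a meta-file.

ILLUSTRATIVE_PATHWORKS_ACL = [
    # Placeholder ACE text -- purely illustrative, not a real PATHWORKS ACE.
    "(IDENTIFIER=PCFS$GATEWAY,ACCESS=READ+WRITE+EXECUTE+DELETE)",
]

def acl_for(path, pathworks_mount, read_acl_meta_file):
    """Return the ACL the VMS client should report for 'path'."""
    if pathworks_mount:
        # Deduced at runtime; nothing is ever written to the NFS server.
        return list(ILLUSTRATIVE_PATHWORKS_ACL)
    # Otherwise fall back to whatever ACL meta-file (if any) is on the server.
    return read_acl_meta_file(path)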
> > It is hard to imagine what kind of headache (other than what Reece
> > mentions below) is being referred to.  Files are files, that is what
> > the server is there for.
>
> Customers typically acquire servers based on a projection of capacity
> requirements.  It does not seem reasonable to tell customers that they
> must regen existing server file systems just because they are introducing
> your VMS NFS Client into the environment.  This would mean that they have
> to take their servers offline, back up the file systems, re-make the file
> systems, and restore the user files.  THIS IS A HEADACHE.

Sorry, introducing ANY client into the environment may change the resource
requirements on the server.  Any server of the sort that has the maximum
number of files pre-configured can run into trouble when lots of small files
are stored.  Having LOTS of meta-files compounds the problem but does not
make it fundamentally different from not using meta-files.  You are also
assuming the worst case, where a meta-file is required for every file
created.  This is the same as assuming the worst case where a client just
creates lots and lots of small files.  Meta-files are there to ensure VMS
semantics when other mechanisms are unable to.  In reality you end up with
few meta-files even when storing lots of VMS record attributes.

> > Now for the administrative burden on a UNIX NFS server...  In the
> > unlikely event that you have some of these unwanted files hanging around
> > (and over 2 years of having the MultiNet NFS Client in the field has not
> > brought any complaints about this) the following very simple shell
> > script (took me maybe 5 minutes to generate it) will find all the files
> > that don't also have a matching data file.  You can then do anything you
> > like with the file (including delete it) by just changing the "echo"
> > line in the script to do something:
>
> Consider the following (real) example:
>
> One of our customers has a NETWARE-based PC network.  The PC users operate
> on files that are managed using a VAX-based document control system.  Word
> processing documents are created on a VAX and then inserted into a
> document control library.  The documents are then moved to an NFS disk
> which is served by a Novell server.
>
> A "monitoring" application performs some additional processing on the
> documents and then moves them (via NETWARE) into an area which can be
> accessed by the PC users.  This process handles thousands of small
> documents weekly.
>
> These files are WORDPERFECT documents and are stored using 128-byte
> fixed-length records.  The meta file approach would not be usable in this
> case.  Because the monitoring application knows nothing about the meta
> files, the file system would quickly exhaust its resources and user
> operations would be halted.  Would you suggest that the customer pay for
> their software to be modified, or perhaps you could provide an equivalent
> NETWARE script for cleaning up the orphan meta files?  Most customers do
> not have resident NETWARE expertise to consider either of these options.
> THIS IS A HEADACHE.

This is an interesting case, and one that is VERY similar to using CD-ROMs
(because you can't do anything to change the files on the CD-ROM).  Now in
this case, you MUST have abandoned the idea of putting meta-data in the file,
because then WORDPERFECT on anything other than the VAX would not be able to
interpret the file correctly.  So you have taken the 1st step towards what we
have done and provided a fixed set of attributes for the files in the NFS
mount options.  You have isolated the record attributes from the file data.

Now this is a special case where this scheme works quite well.  You have a
single purpose in mind for the file system and you specify things
accordingly.  Now, if you want to store other kinds of files, you can't
unless you mount the file system again with other attributes.  A better
approach would be to use the meta-file scheme but have a mount switch of
/DEFAULT_ATTRIBUTES=FILE.FDL, where FILE.FDL would contain an FDL description
of the attributes to be given to files that have no meta-file.
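To sketch what such a switch might look like in practice (the qualifier is
only a proposal here; the FDL fragment and routine names below are
illustrative assumptions, not an existing implementation):

# Sketch of a /DEFAULT_ATTRIBUTES=FILE.FDL mount switch: files with a
# meta-file keep their stored attributes; files without one (for example,
# WordPerfect documents dropped on the disk by a non-VMS application) get the
# attributes described in the default FDL file given at mount time.

DEFAULT_FDL = """\
RECORD
        FORMAT                  fixed
        SIZE                    128
"""

def parse_fdl(text):
    """Tiny FDL reader: returns e.g. {'FORMAT': 'fixed', 'SIZE': '128'}."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            attrs[fields[0]] = fields[1]
    return attrs

def record_attributes(path, meta_file_text_for):
    """Attributes for 'path': the meta-file if one exists, else the default."""
    meta = meta_file_text_for(path)        # None when no meta-file is stored
    return parse_fdl(meta if meta is not None else DEFAULT_FDL)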
> > [Slick little UNIX shell script omitted here for brevity]
>
> This shell script may work fine for UNIX, but what about NETWARE, or any
> of the many other non-UNIX servers?

Actually, the shell script was a "straw man".  There is no need for ANY kind
of script in the real world.

> > A lot of work is done in the MultiNet NFS client to minimize the need
> > for these meta-files.  In reality you will find that this tends to
> > increase the number of files by a few percent.  The worst case would be
> > for something like PathWorks that wants to put ACLs on everything --
> > but, as discussed above, there are VERY good solutions to this problem.
>
> Do you describe in your documentation which file types require meta files?
> This is an important consideration for customers.  In the case of our
> customer (described above) the files are stored using fixed-length
> 128-byte records.  Can you preserve these attributes without the use of
> meta files?  This is an important question considering the volume of
> document processing performed.

Which file types require meta-files is irrelevant -- they just take fewer
resources on the server.  I think my suggestion for /DEFAULT_ATTRIBUTES
handles this quite nicely and in a more general fashion than Reece described
for the TWG NFS client.

> > The same scheme that minimizes the occurrence of deleting a file from
> > the server side also serves to minimize this occurrence.  A UNIX user
> > moving the file will not move the hard link that has the version number
> > -- certainly not accidentally.  Of course, the client takes care of all
> > the
>
> Again you are assuming that file servers are UNIX-based.  What about
> NETWARE, VM, etc.?

The vast majority of NFS servers ARE UNIX-based, or at least provide the
necessary UNIX (NFS) semantics.  Otherwise UNIX clients don't see the file
servers as fully functional file systems.  We do have mount options designed
to handle these kinds of servers, and you DO get reduced functionality in the
VMS file system, but at least you get as much VMS functionality as possible
given those file servers.  There are, in fact, better solutions for dealing
with the NETWARE world and I will comment on that sometime in the near
future.

> > accidentally.  Of course, the client takes care of all the details when
> > a file is moved using the VMS NFS client.  You will also note that the
> > format of the meta-files was carefully chosen to make recovery very easy
> > in the unlikely event of an accident.  The format is the same as a VMS
> > FDL (File Definition Language) file -- so VMS utilities can be used to
> > analyze a similar file and get a meta-file that can be placed on the
> > server.
>
> What if the meta file is lost, or the customer does not know from which
> directory the file originated?  The customer may not know the original RMS
> attributes for the file.  THIS IS A HEADACHE.
In the VERY unlikely event that a meta-file is lost, it is HARD for me to
imagine a situation where there is an application that uses some special kind
of file of which there is only ONE such file in the entire world, and that
was the file that just lost its attributes, AND it was never backed up so you
would never know what the attributes are and you couldn't somehow figure it
out!  The directory from which the file originated is irrelevant.  The TGV
NFS Client purposely chose the FDL representation to make dealing with these
problems quite straightforward (again, I would point out that in over 2 years
of release the NFS Client has not had these problems).

> > You would need to allocate a fixed-size area for the meta-data which
> > would be VERY expensive to expand in a file were it ever necessary.  Not
> > being able to expand the meta-data area is a problem -- as ACLs and
> > other meta-data can grow beyond whatever fixed size you decide to use.
> > Also note that just as in the meta-file technique, you use up file
> > system resources to store the meta-data.  The next problem, which
> > Wollongong only partly solves
>
> Since we don't have to be concerned about the management of on-disk
> resources, the amount of storage required for meta-data is quite small -
> much less than the amount of space consumed by a typical server file
> system "inode".  More importantly, the system manager does not have to
> regen the file system to accommodate this meta-data.  File systems are
> typically configured into two regions - one for storing control
> information such as "inodes", and a much larger region for general use.
> Our meta-data resources are obtained from this larger pool.

Actually, "inodes" (nice to see that you assume UNIX servers too) consume
very little disk space.  The only issue would be the pre-allocation of inodes
in the file system, and I think I have already addressed this issue.  The
system manager most certainly does NOT have to regen the file system to
accommodate TGV's meta-data.

> > meta-data.  The next problem, which Wollongong only partly solves, is
> > that there is a large class of VMS files that have identical on-disk
> > representations as their UNIX counterparts.  It is very important to be
> > able to share the data.  Just allowing a single mapping from stream
> > format to fixed-length records is not enough.  Your scheme does not
> > work, for example, with VMS and Ultrix bookreader files -- the MultiNet
> > approach allows you to transparently access Ultrix bookreader CDs from
> > the VMS bookreader.  We have run across many examples of this.  Once you
> > have to place the meta-data in the data file, the UNIX applications will
> > not be able to read the files.
>
> I assume that you are referring to our /STREAM= qualifier which allows
> stream files to be represented to VMS NFS Client applications as
> fixed-length record files.  The reason that we implemented this was that
> one of our customers has an application that was designed to operate on
> files which have been obtained using an FTP command procedure.  FTP stores
> them as fixed-length 512-byte record files.  The customer did not want to
> have to pay someone to modify their application so that it could operate
> on stream files.  Hence the /STREAM qualifier.
>
> There is absolutely no meta-data used under these circumstances.

My opinion is that this is a "hack" for a single customer that does NOT serve
the whole customer base well.
The basic idea of being able to specify the attributes to be applied to files
that have no meta-data (or meta-file, in TGV's case) seems like a good one.
But a more general approach (as described above) would be much better.  I
would also note, as we seem to be getting into a religious war over how to
store VMS attributes, that in this case you are NOT storing attributes in the
file itself (this I applaud).

> > Having a mount switch to force the conversion of fixed-length record
> > files to stream files is a problem when you have a mounted NFS disk that
> > you are using for several purposes, some of which require your
> > conversion in order to work and some of which break when the conversion
> > is done.  Forcing a VMS user (possibly through a system manager) to make
> > the decision on these semantics takes away from the transparency people
> > will want to see in the NFS client.
>
> It would be easy enough to handle this through the use of device logical
> names.  However, I am not aware of any customers who have even encountered
> this issue.

Again, this strikes me as a bit of a band-aid -- rather than coming up with a
good general mechanism for dealing with this problem.

> > It also comes to mind that VMS file version limits are going to be a
> > problem here as well.  When you do a VMS SET FILE/VERSION_LIMIT=n it
> > sets a version limit for a particular file in a particular directory.
> > The semantics DO NOT allow you to store that information in the file
> > itself -- this version limit is a property of the directory entry, not
> > the file.
>
> An interesting point!  We do not claim to support version limit
> enforcement at the file level.  We support version limit enforcement at
> the directory level.  In this case RMS uses the version-limit field in the
> file header if there is none specified in the directory entry - we emulate
> this field.  The interesting aspect to this question is that in our design
> we did not completely reject the meta file approach.  If a file system is
> mounted with version limit enforcement enabled (it is disabled by
> default), then a meta-file (in that directory on the server) is created to
> store the version limit information.

Well, you use meta-files here to store the information -- why go only halfway
and support just directory version limits?  Use MORE meta-files and support
MORE VMS semantics!

> > This really only helps with NFS servers that have eNFS algorithms that
> > can delay updating certain file attributes on the server's disk when
> > there are other write requests coming in for the same file before the
> > initial write completes.  NFS servers using accelerator boards and
> > regular NFS servers don't really benefit from this.
>
> We have found that on our in-house machines, pipelining consistently
> achieves a 50% improvement.  Among the servers tested are SUN 3/60, SUN
> Sparc I, SUN Sparc II, and various DEC servers.  I have not encountered
> ANY file servers where pipelining did not make a significant performance
> impact.

I would be very interested in packet traces to see what the timings look
like.  Back-to-back 8 KB writes should do just about the same with or without
pipelining.  Are you sure you are not seeing some other problem that is being
masked by the pipelining?
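To illustrate what "back-to-back writes" means here, a rough sketch of a
client keeping several 8 KB write RPCs in flight at once, each one completing
only when the server replies.  (Illustrative only; the constants and routine
names are assumptions and the NFS details are simplified -- this is not
either vendor's code.)

# Sketch: the throughput win comes from overlapping write RPCs, which a
# client can get with ordinary asynchronous I/O -- it never has to report a
# write as complete before the server has actually acknowledged it.

import concurrent.futures

WRITE_SIZE = 8192           # one NFS WRITE RPC per 8 KB block
MAX_OUTSTANDING = 4         # hypothetical cap on in-flight requests

def nfs_write_rpc(offset, data):
    """Stand-in for one NFS WRITE round trip; returns when the server acks."""
    return len(data)        # the real network I/O would happen here

def write_async(data, start_offset=0):
    """Write 'data' in 8 KB blocks, keeping several RPCs outstanding."""
    blocks = [(start_offset + i, data[i:i + WRITE_SIZE])
              for i in range(0, len(data), WRITE_SIZE)]
    with concurrent.futures.ThreadPoolExecutor(MAX_OUTSTANDING) as pool:
        futures = [pool.submit(nfs_write_rpc, off, blk) for off, blk in blocks]
        # Each result is available only after the server's reply, so the
        # caller's request completes with the data safely written.
        return sum(f.result() for f in futures)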
> > Unless I am mistaken, the pipelining is accomplished by telling the
> > application that a write operation has been completed on the server even
> > though it has NOT.  If so, this is a problem -- as the semantics of
> > IO$_WRITEVBLK to a file say that the I/O operation does not complete
> > until the data is safely on the disk.  There are many applications, like
> > databases (and even RMS's indexed files), that can get corrupted files
> > on a client crash if this scheme is used.  The ability to have more than
> > one outstanding write request on a file on VMS is selected (or not
> > selected) by the application -- usually through RMS defaults.  RMS (or
> > the applications themselves) do asynchronous I/O, which would allow a
> > non-pipelined NFS client to get the same effect but only when it was
> > appropriate to.  So, a multi-threaded NFS client would be able to get
> > the same improvement in network bandwidth utilization WITHOUT violating
> > the semantics of the VMS file system.
>
> This is true, and it is discussed in our documentation.
>
> The pipelining feature was added to improve performance for large
> sequential transfers, e.g. BACKUPs.  Customers don't have to be concerned
> about configuring RMS parameters etc.
>
> For cases where customers do not want to use pipelining, it can be
> disabled.

Good.  I still claim that having true asynchronous VMS I/O gets you the exact
same pipelining effect without having to specify anything and without having
to worry about any of the consequences of the changed VMS $QIO semantics.
Customers are only concerned about RMS parameters to the same extent they
would be when using a local disk and trying to optimize I/O performance.  A
typical NFS operation mix (except for backups -- and I think that using a
remote magtape client from VMS is a better solution for doing backups than
using an NFS client) is 5 reads to every write, so it is still not clear to
me that pipelining could gain much except in very specific circumstances.

> > There are, I think, other issues in NFS client performance that have not
> > been touched upon.
> > ...
>
> We also have such algorithms.  Any serious NFS Client design for VMS would
> have to consider this.  We have implemented a configurable directory entry
> caching scheme.  We have made it configurable so that the customer can
> trade off cache resource usage for response time.  For instance,
> operations on small directories will not see a significant performance
> benefit from caching, so there is no need to consume any resources.

Any directory that contains more than one file will see a performance
improvement from a directory cache.  Actually, even one file will see a
performance improvement, because the only NFS operation required in most
cases will then be a GET-ATTRIBUTES to check for cache consistency.  The TGV
NFS Client puts its cache in virtual memory with some good cache management
to minimize paging, so the resources used for the cache should not be much of
an issue and no tradeoffs need to be made.  The algorithms I was referring to
were not the cache -- you can't have a usable VMS NFS client without one --
but the algorithms that greatly reduce the number of NFS requests the VMS NFS
client must make of the server to be able to process incoming $QIOs (in
particular directory lookup operations).
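A minimal sketch of the kind of directory caching being discussed -- the
listing is fetched and sorted once, then revalidated with a single
GET-ATTRIBUTES call.  (The class and routine names are illustrative
assumptions, not either product's code.)

# Sketch of a directory-entry cache: the sorted listing is reused as long as
# a single GETATTR shows the directory's modify time on the server has not
# changed, so most directory operations cost one RPC instead of a full
# re-read and re-sort.

class DirectoryCache:
    def __init__(self, getattr_rpc, readdir_rpc):
        self.getattr_rpc = getattr_rpc   # returns the directory's mtime
        self.readdir_rpc = readdir_rpc   # returns names in server (random) order
        self.cache = {}                  # directory handle -> (mtime, sorted names)

    def listing(self, dir_handle):
        """Alphabetically sorted listing of 'dir_handle', VMS-style."""
        mtime = self.getattr_rpc(dir_handle)       # one RPC for consistency
        hit = self.cache.get(dir_handle)
        if hit and hit[0] == mtime:
            return hit[1]                          # no READDIR, no re-sort
        names = sorted(self.readdir_rpc(dir_handle))
        self.cache[dir_handle] = (mtime, names)
        return names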
> > When the number of files in a server directory becomes large (we have
> > customers storing many thousands of files in each directory) the
> > computational complexity of just sorting the names of the files (they
> > come from the server in random order and must be alphabetically sorted
> > for VMS) is a problem.
> > ...
>
> Again, any serious NFS Client implementation will address this.  Our
> directory entry caching scheme prevents the redundant computations that
> you are referring to.  During our initial field test, one of our customers
> amused us by regularly storing 5000 entries in a single directory.  The
> response time for creating and deleting files was very reasonable - on the
> order of a few seconds - on an active system.  It is important to clarify
> that the "2 seconds" you mentioned above is "compute time".  This is not
> the same as observed "wall clock response-time".

No -- that was wall-clock response time, and that was for the first time a
directory was accessed.  Creating files after that 1st time was considerably
faster.

David