From: US2RMC::"EVERHART@arisia.gce.com" 15-OCT-1996 19:34:33.53
To: GCE@arisia.gce.com
CC:
Subj: perf

From: ARISIA::EVERHART 15-OCT-1996 18:57:26.14
To: GCE
CC:
Subj: perf talk stuff

Performance Hints n Kinks

(If this talk is given it would be a sort of panel discussion with the
following points as things to suggest (others too) so that after that
the audience would come up and offer technical points too.)

1. RMS: set RAH/WBH in RAB. Affects nothing but speeds up access. Note
that in the [TMESIS] directory of the latest sigtapes or Freeware V3 CD
there is code to do a generic intercept of a system service in VMS on
Alpha. This can be used to intercept SYS$CONNECT and set these bits in
the user RAB to basically provide a systemwide speedup. There should be
no other effects, though test it on your system first!

2. SET RMS...more or larger buffers are good.

3. Use FASTIO if possible. Avoids many operations of $QIO, by setting up
buffers and control blocks once, using many times. Also can avoid the
detour through IPL4, AST at I/O postprocessing time, and a fair bit of
spinlock time.

4. Set ACP_DATACHECK to 0 unless you have reason to suspect hardware
integrity problems. Causes file operations (directory, bitmap, and
indexf) not to use datacheck. Datacheck on SCSI causes the driver to do
one-by-one operations (no queueing), and usually costs one disk rotation
as well. Major slowdown. Considered unnecessary on modern disks by
filesystem folks.

5. For SCSI disks, the disable tagged command queueing bit in drives can
be set with sys$etc:scsi_mode.exe. There have been release notes about
this. This prevents tagged command queueing from being used for that one
disk. Some drives don't implement it right. Do this ONLY when you need
it. Tagged command queueing is a major speedup when it is available, and
should only be disabled if you cannot get your Brand X drive to work
with it properly. Digital drives either don't claim to support it, or
support it right.

6.
Some SCSI devices (especially CDs and tapes) don't really understand
multiple LUNs and will misbehave in ways ranging from generating
spurious errors all the way to completely hanging your SCSI bus if
something (like autoconfigure) tries to access "other" LUNs. In that
case the workaround is to use SYSMAN IO SET to permanently exclude
devices like that from autoconfigure. (There is a "show" option...see
the help for sysman io...to show what is excluded.) Then put commands
like

    SYSMAN IO CONNECT DKB300:/NOADA/DRIVER=SYS$DKDRIVER

in your SYSTARTUP_VMS.COM to connect the devices. This facility can be
used for any device that misconfigures. (Check with L. Szubowics on this
one first)

7. MKdriver for 7.1 will be able to position magtape with SCSI SKIP FILE
commands. However, to enable this you need to run MKSET thus:

    $ mkset:==$sys$etc:mkset
    $ mkset/always mkxxxx:

where "mkxxxx:" is your SCSI magtape. This can be done at any time, but
it causes the tape to find not the first double EOF mark but the one at
the end of data on the tape. On DLT especially it saves lots of time.
You can force the system to use the old skip by records behavior with
"mkset/never" or to use the old behavior by default with "mkset/per_io"
commands. A new function modifier io$m_allowfast to io$_skipfile will
also allow the skip by filemarks if "mkset/per_io" is set (the default).

8. BACKUP/PHYSICAL is ~twice as fast as normal VMS Backup. There is a
utility on the freeware CD to present such backup savesets, provided
they are on one tape only, as a virtual readonly disk to VMS. (On
freeware as [fastback]). Try it out; if it works on your tape you may be
able to save some time. It has been on VMS SIG tapes also. Older
versions exist on some other tapes also. (This is for 7.1, should be in
relnotes or have been in previously. Won't stay forever most likely.)

9.
If your SCSI busses are exceedingly heavily loaded, you may want to
boost the VMS7 SYSGEN parameter (to boost the number of seconds for
timeout of I/O disconnects) to prevent the load from causing I/O to fail
unnecessarily. (Note too that DKdriver will count recovered errors in
the error log if they are not "data recovered" errors, though it will
retry these operations to get good data if necessary and not report
errors to applications unless it fails.)

10. Usual stuff. If you use multibuffer I/O, be very careful to have
separate IOSB blocks for all I/O requests, and do NOT reuse these until
the operation is complete. If using ASTs, the IOSB is still a good idea
so you can keep track of results.

11. High water marking in some cases (where writing to a file only fills
it sparsely) can greatly degrade performance. Sequential writing is
generally fine though. This degradation is quite unusual and depends
heavily on write patterns.

12. To adjust the I/O buffer count and size you can use

    $ SET RMS/SEQ/BUF=x/BLO=y    ! Convert will listen to this!

13. There are striping drivers for VMS on the sigtapes, and the most
recent have virtual disks that work with Digital shadowing, in addition
to the Digital striping driver. (On the sigtapes you may want to look at
vwdriver.mar...) Striping in general spreads I/O over several spindles,
getting faster access. As will be noted below, the Digital driver knows
all the interactions with cluster transition, mount verify, etc. The
free ones have limitations which are described in the source.

14. With 7.1, if you want to have code that does its own postprocessing,
you can skip the detour through IPL 4 and fork if you set the
IRP$M_FASTIO AND the IRP$M_FINIPL8 bits. Then the postprocessor will
call your code via JSB or CALL (doesn't matter) as normal but at IPL 8.
You must reset these bits to their prior values as well as restoring
IRP$L_PID before finally doing normal postprocessing of the IRP.
(Normal fast io finishing will not use IPL 8 if a system routine is to
be called, unless this added flag is set.) (This is for those who have
listing CDs.)

15. File extend size matters. Many times short files get extended a few
blocks at a time. Setting the volume extend quantity larger can help
here. A fragmentation avoiding tool on the next sigtapes will allow
io$_modify extension to be a fraction of the size of the original file.
(NT is reported to extend this way, doubling file size, then truncating
back on close.) When writing large files this can be a 30% effect in
write speed.

16. Interactions with cache are also important. For ISAM files it is
recommended that there be large enough caches to hold all blocks of an
ISAM directory. MONITOR FILE gives a good running snapshot of how the
various caches are doing.

17. VIOC cache caches virtual I/O data and keeps its data in S0/S1 space
(at the moment). This can be a limitation, but I/O speed does rise with
it. [Check...does installing a file help with keeping frequent I/O files
in the cache where they are read from many sites?]

Commercial I/O caches generally cannot cache logical I/O either, but
tend to pull space off the free list and thus in principle don't
interact with system space constructs in the same way. These must
however intercept start I/O and I/O posting in general to gain access to
data, and some route IRPs to a cache process so that the locking
interactions are specified. Thus ordinary locks - which in some cases
generate considerable locking traffic - are used to spread information
around a cluster about when caches must be flushed. This extra traffic
is in general needed. Beware of products which try to "piggy back" on
VMS internal locks; the well known commercial ones do not. These
products however generally know no more about disk operations than the
"vanilla" VMS interface.
There are several DDTAB driver entry points which interact with servers,
mount verification, and cluster code, and which generate significant
complexity when something unusual happens...a cluster state transition,
a disk going into mount verify, a bus reset on a SCSI bus, or the like.
These entries control details of I/O cancellation, entry and exit into
mount verify, controls to tell the cluster to find a new path, and even
the low level controls to queue an IRP where the driver expects it on
its I/O queue. A fully general intercept should be aware of all of
these, not only of the start_io entry point, where I/O is to be cached.
The penalty for not doing this can be disk corruption. Before using such
a package, you are best advised to check with the vendor to ensure that
all underlying driver cluster I/O behavior is preserved.

18. There are a number of tools like the TURBO program (see ftp.wku.edu)
which play tricks to speed up I/O. Turbo, for example, locks frequently
accessed images into memory. This and similar hacks can be used in
particular situations where it is known that some one image will be
accessed repeatedly even though periods of non-access exist. This came
from a DECUSERVE posting. The danger of such a tool is however that it
breaks normal VMS memory management and can cause memory available for
other applications to be short or unavailable. This is why Digital has
no such thing and why the tool offered as part of VMS is the VIOC cache.
This kind of memory wedging can be vital if you have some unusual need,
but should be avoided unless you are familiar enough with the system to
be able to recognize when you have caused worse performance, hangs,
crashes, etc. and to be prepared for them.
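
The file-extend point in item 15 can be illustrated with a small model.
This is only a sketch in Python, not VMS code: it counts how many extend
operations a sequentially written file would need under a fixed extend
quantity versus extension by a fraction of the current allocation. The
function names, the 5-block quantity, and the 100000-block file size are
assumptions picked for illustration, and the model says nothing about
the real cost of each extend or about fragmentation.

```python
def extends_fixed(final_blocks, quantity):
    """Grow a file in fixed steps; return (extend count, blocks allocated)."""
    allocated = 0
    extends = 0
    while allocated < final_blocks:
        allocated += quantity
        extends += 1
    return extends, allocated


def extends_proportional(final_blocks, fraction, minimum):
    """Grow a file by a fraction of its current allocation per extend.

    'minimum' is the smallest step, used while the file is still tiny.
    Returns (extend count, blocks allocated); any excess over final_blocks
    stands for what would be truncated back when the file is closed.
    """
    allocated = 0
    extends = 0
    while allocated < final_blocks:
        allocated += max(minimum, int(allocated * fraction))
        extends += 1
    return extends, allocated


if __name__ == "__main__":
    size = 100000  # hypothetical final file size, in blocks
    print("fixed 5 blocks :", extends_fixed(size, 5))
    print("doubling       :", extends_proportional(size, 1.0, 5))
    print("grow by half   :", extends_proportional(size, 0.5, 5))
```

In this model the fixed 5-block policy needs one extend for every 5
blocks written, while the proportional policies need only a few dozen
extends at most, at the price of temporary over-allocation that has to
be truncated on close.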