From:	MERC::"uunet!WKUVX1.BITNET!MacroMan" 31-AUG-1992 21:28:29.17
To:	uunet!"macro32@wkuvx1.bitnet"
CC:	uunet!JON
Subj:	DDB$L_DDT != UCB$L_DDT

Summary: What reasons are there for not changing the
         UCB$L_DDT field of a disk device to point to a
         "cloned" DDT with a different startio vector?

If you aren't interested in device driver internals, you may want to
skip the rest of this (long) article.


On Aug 11th, 1992 I posted some comments about serializing access to a
DRM-600 6 CD-ROM mini-changer.  My plan is to modify the CDDRIVER that
was submitted to DECUS.  This is related to that posting.

Following is the background leading up to my question.

-----------------

Some random comments about CDDRIVER as distributed on the VAX89B1
DECUS tape.  CDDRIVER is a cache driver written by Paul Sorenson
of American Electric Power (AEP).  It is an intercept driver that
is transparent to application I/O requests.  I will refer to this
version as CDDRIVER.89B1 or just CDDRIVER.

CDDRIVER effectively replaces the startio routine of the disk drive that
is being cached.  It does this by modifying the startio vector in the
target disk's Driver Dispatch Table (DDT) to point to the CDDRIVER
altstartio routine.  Since the address of the original startio routine
must be saved, and because caching requires additional data structures, a
place must be provided to hold this additional context.  Normally, a
driver writer would just provide UCB extensions for this purpose.  We
can't extend the disk device's UCB, so a CDDRIVER (CDA) unit is
associated with each disk being cached, and its UCB is used as an
extension of the disk's UCB.  The problem to be solved is the following:

    Given the address of a disk UCB (in R5), find the address of
    the corresponding CDAx UCB, or determine that no CDAx unit is
    associated with the current device.  Also, if all units for
    the disk driver are being vectored through the replacement
    startio routine, then the original startio routine address
    must be determined, so that IRPs for non-cached drives using
    the same disk device driver can be passed on to the correct
    startio routine.

In CDDRIVER.89B1, the DDT startio vector of the disk device driver is
changed to point to the CDDRIVER altstartio routine when the first disk
using the disk driver is cached.  Since there is a single DDT for the
disk driver, all disks using the same driver are also revectored through
CDDRIVER's altstartio routine.  As a result, the replacement startio
routine must check whether each request is for a cached disk.  It
does this by scanning all the CDAx units for one associated with the
current I/O request, i.e. one whose UCB$L_CD_DSKUCB equals the UCB
address for the current IRP (the value in R5).  This strategy takes the most
time for disks that are not being cached, because it must verify that
every CDAx unit is NOT associated with the current request.  This is one
reason to limit the number of devices that are being cached.  The current
driver is limited to 8 units.
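
As an illustration only, the scan described above can be modeled in
user-mode C.  This is a sketch, not code from CDDRIVER: the struct
layouts and the name cda_find_linear are made up for the example, and
only the field UCB$L_CD_DSKUCB and the 8-unit limit come from the
description above.

```c
#include <assert.h>
#include <stddef.h>

#define CD_MAX_UNITS 8              /* unit limit stated in the text */

/* Hypothetical, heavily simplified stand-ins for the real UCBs. */
struct disk_ucb { int unit; };

struct cda_ucb {
    struct disk_ucb *cd_dskucb;     /* models UCB$L_CD_DSKUCB; NULL if idle */
};

/* Return the CDA unit caching `disk`, or NULL if the disk is not cached. */
struct cda_ucb *cda_find_linear(struct cda_ucb units[], size_t n,
                                const struct disk_ucb *disk)
{
    assert(disk != NULL);           /* R5 always holds a real UCB address */
    for (size_t i = 0; i < n; i++) {
        if (units[i].cd_dskucb == disk)
            return &units[i];
    }
    return NULL;                    /* worst case: every unit was examined */
}
```

The NULL return is the expensive path: a non-cached disk pays for a
comparison against every CDAx unit on each request.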

In an effort to reduce the impact on non-cached disks and to allow for a
larger number of cached devices, I started thinking about possible
changes to CDDRIVER.

The problem of affecting non-cached devices using the same driver could
be eliminated by cloning the driver's DDT, modifying the startio vector
in the cloned DDT to point to the replacement startio routine, and then
modifying the UCB$L_DDT pointer of the cached disk to point to the cloned
DDT.  Since the original DDT is no longer changed, non-cached devices
will not be affected in any way.

Cloning the DDT also eliminates the need for the replacement startio
routine to determine whether the I/O request is from a non-cached drive,
since only cached disks' I/O should ever reach the replacement startio
routine.  This allows us to make some other optimizations, since we don't
have to remember the original startio routine for non-cached disks.
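
A rough user-mode C model of the cloning step may make the idea
concrete.  Everything here is an assumption made for illustration: the
two-field DDT, the function name cd_intercept and the demo startio
routines are invented, and a real driver would also need proper
synchronization while swapping the pointer, which is not modeled.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*startio_fn)(void *irp, void *ucb);

/* Hypothetical miniature DDT and UCB; the real structures are larger. */
struct ddt {
    startio_fn start;           /* models the startio vector */
    int        other_vectors;   /* stands in for the rest of the DDT */
};
struct ucb {
    struct ddt *ddt;            /* models UCB$L_DDT */
};

/* Stand-ins for the disk driver's startio and the cache's altstartio. */
int demo_real_startio(void *irp, void *ucb) { (void)irp; (void)ucb; return 1; }
int demo_alt_startio(void *irp, void *ucb)  { (void)irp; (void)ucb; return 2; }

/*
 * Copy the driver's DDT into `clone`, change only the startio vector in
 * the copy, and repoint this one UCB at the copy.  The shared DDT is
 * never modified.  Returns the original startio address.
 */
startio_fn cd_intercept(struct ucb *disk, struct ddt *clone,
                        startio_fn alt_start)
{
    assert(disk != NULL && disk->ddt != NULL);
    assert(clone != NULL && alt_start != NULL);

    startio_fn original = disk->ddt->start;
    *clone = *disk->ddt;        /* structure copy of the whole DDT */
    clone->start = alt_start;   /* only the copy is revectored */
    disk->ddt = clone;          /* only this unit sees the copy */
    return original;
}
```

In this model a second UCB that still points at the driver's own DDT
keeps dispatching to the original startio routine, which is the
isolation property described above.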

To handle a large number of cached devices without a linear increase in
the amount of time to find the associated CDA UCB, one possible method
would be to hash R5 to produce a hash table index for the associated CDA
unit queues.  This should reduce our average search substantially.  The
number of hash queue headers could be controlled by a define in CDDRIVER.
For this to work we have to guarantee that only cached devices will use
the replacement startio routine, otherwise we wouldn't find an associated
CDA unit, and we wouldn't be able to recover the original startio address
(without looking through all the queues for a unit using the same
driver).
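
A minimal sketch of the hashing idea, again as a user-mode C model
rather than driver code.  The names (CD_HASH_QUEUES, cd_hash_index,
cd_hash_insert, cd_hash_find), the choice of 16 queues and the hash
function itself are all assumptions for the example; any function that
spreads UCB addresses across the queues would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CD_HASH_QUEUES 16       /* hypothetical define; a power of two */

struct disk_ucb { int unit; };

struct cda_ucb {
    struct cda_ucb  *next;      /* link within one hash queue */
    struct disk_ucb *cd_dskucb; /* models UCB$L_CD_DSKUCB */
};

static struct cda_ucb *cd_hash[CD_HASH_QUEUES];   /* queue headers */

/* Map a disk UCB address (the value in R5) to a queue index. */
static unsigned cd_hash_index(const struct disk_ucb *disk)
{
    uintptr_t a = (uintptr_t)disk;
    a >>= 4;                    /* drop low bits, assumed mostly alignment */
    a ^= a >> 8;                /* fold in some higher-order bits */
    return (unsigned)(a & (CD_HASH_QUEUES - 1));
}

/* Put a CDA unit on the queue selected by its disk's UCB address. */
void cd_hash_insert(struct cda_ucb *cda)
{
    assert(cda != NULL && cda->cd_dskucb != NULL);
    unsigned i = cd_hash_index(cda->cd_dskucb);
    cda->next = cd_hash[i];
    cd_hash[i] = cda;
}

/* Search only the one queue the disk hashes to. */
struct cda_ucb *cd_hash_find(const struct disk_ucb *disk)
{
    assert(disk != NULL);
    for (struct cda_ucb *p = cd_hash[cd_hash_index(disk)]; p; p = p->next) {
        if (p->cd_dskucb == disk)
            return p;
    }
    return NULL;                /* should not happen for a cached disk */
}
```

With N cached disks spread evenly, each lookup walks about
N / CD_HASH_QUEUES entries instead of N.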

Another possible optimization would be to have a periodic routine sort
entries in the queues based on number of I/O operations (i.e. a repeating
system TQE would synchronize with the driver, examine the queues and
counts and reorder them).
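
The reordering pass could look something like the following user-mode
C model.  It is only a sketch: the io_count field and both function
names are hypothetical, and the synchronization with the driver that a
real TQE routine would need is not shown.

```c
#include <assert.h>
#include <stddef.h>

struct cda_ucb {
    struct cda_ucb *next;       /* link within a search queue */
    unsigned long   io_count;   /* hypothetical per-unit I/O counter */
};

/* Insertion sort of a singly linked queue, busiest unit first. */
struct cda_ucb *cd_sort_by_io(struct cda_ucb *head)
{
    struct cda_ucb *sorted = NULL;

    while (head != NULL) {
        struct cda_ucb *node = head;
        head = head->next;

        /* Find the first entry with a smaller count than `node`. */
        struct cda_ucb **link = &sorted;
        while (*link != NULL && (*link)->io_count >= node->io_count)
            link = &(*link)->next;

        node->next = *link;
        *link = node;
    }
    return sorted;
}

/* Models what the periodic timer routine would do to one queue. */
void cd_resort_tick(struct cda_ucb **queue_head)
{
    assert(queue_head != NULL);
    *queue_head = cd_sort_by_io(*queue_head);
}
```

An insertion sort is used because the queues are short and usually
close to sorted from the previous pass.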

If you really feel lucky, you can avoid the reverse search by breaking
the rules.  If we can find a longword field in the UCB that is not being
used by the disk driver, then we can use this field to store the UCB
address of the associated CDAx UCB.  The two likely candidates for this
are the UCB$L_AMB (associated mailbox UCB address, which is currently not
used by DEC for disk devices), and UCB$L_XTRA (which is documented as SMP
alternate STARTIO wait, whatever that means).

I talked with Paul Sorenson on August 13, 1992 about CDDRIVER.  He hasn't
made any changes to the CDDRIVER since it was submitted to the VAX89B1
tape.  We discussed the possibility of cloning the DDT and then modifying
the UCB$L_DDT field to point to this copy.  We couldn't think of any
obvious problems.  I then asked about using a field in the disk UCB.
Paul thought about it a second and then came up with a much safer idea.
If we are going to clone the DDT, why not put it in an extension of the
CDDRIVER UCB?  We will use up more non-paged pool, since we will need a
copy in each CDA unit's UCB instead of one copy per disk device driver
type.  However, when we change the UCB$L_DDT field of the disk UCB to
point to the cloned DDT, we will have a pointer to a structure within the
associated CDA UCB.  Now, a negative offset from the DDT structure will
return the address of the associated CDA UCB.

      Questionable practice

UCB$L_CD_ACDUCB = UCB$L_AMB  ; define our symbol to overload an "unused"
                             ; UCB field.  Another possibility UCB$L_XTRA

        MOVL    UCB$L_CD_ACDUCB(R5), R0    ;; Get address of associated
                                           ;; CDA unit UCB

++++++++++++++++

      Safer practice

        MOVL    UCB$L_DDT(R5), Rx          ;; Get address of cloned DDT
                                           ;; within the associated CDA
                                           ;; unit's UCB extension.
        MOVAL   -UCB$L_CD_CLONEDDT(Rx), R0 ;; Now get the address of the
                                           ;; associated CDA unit UCB.

        Where Rx is a scratch register.  It could be R0, but under most
        VAX implementations, using a different register will be faster.

      Another alternative:

        SUBL3   #UCB$L_CD_CLONEDDT, -      ;; Get address of associated
                UCB$L_DDT(R5), R0          ;; CDA unit UCB

      Ok, which of the previous two is faster?  They both require 10
      bytes.

++++++++++++++++
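
For readers more at home in C, the address arithmetic in the "safer
practice" fragments above is the usual "containing structure" trick.
The following user-mode model is a sketch under assumptions: the
struct layouts, the cd_orig_start field and the name cda_from_disk are
invented for the example; only the idea of a cloned DDT embedded in
the CDA UCB (UCB$L_CD_CLONEDDT) comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*startio_fn)(void *irp, void *ucb);

struct ddt { startio_fn start; };   /* miniature DDT */

struct disk_ucb {
    struct ddt *ddt;                /* models UCB$L_DDT */
};

struct cda_ucb {
    int        unit;
    startio_fn cd_orig_start;       /* hypothetical saved startio */
    struct ddt cd_cloneddt;         /* models UCB$L_CD_CLONEDDT */
};

/*
 * Recover the CDA UCB from a cached disk's UCB.  Valid only when
 * disk->ddt points at the cd_cloneddt member of some cda_ucb, which
 * is why only cached disks may reach the replacement startio routine.
 */
struct cda_ucb *cda_from_disk(const struct disk_ucb *disk)
{
    assert(disk != NULL && disk->ddt != NULL);
    return (struct cda_ucb *)((char *)disk->ddt
                              - offsetof(struct cda_ucb, cd_cloneddt));
}
```

No search and no borrowed field in the disk UCB is needed; the cost is
one load and one subtraction.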

This all leads to the following options.

1.  Be "safe".  Don't clone the DDT or change UCB$L_DDT in
    cached disk UCBs.  This precludes the use of hash queues.
    All CDAx UCBs will have to be explicitly searched for the UCB
    of the disk drive.

2.  First level optimization.  Clone the DDT and change UCB$L_DDT
    in cached disk UCBs.  This isolates non-cached units from the
    changes.

3.  Second level optimization.  Requires first level
    optimization.  Hash the disk UCB address and choose a queue of
    CDAx units based on the result.  This should substantially
    reduce the average number of CDAx UCBs that must be searched
    to find the one associated with the disk UCB.

4.  Third level optimization.  Requires first level optimization,
    and the DDT must be cloned into the associated CDAx UCB.  The
    address of the associated CDAx UCB is computed from the
    address of the cloned DDT.  This is what I am planning to do.

5.  Fourth level optimization.  Requires first level
    optimization.  Overload an unused field in the disk UCB to
    contain the address of the associated CDAx UCB.  Not
    recommended.

Note that options 1 through 3 can be further optimized by having a
repeating system TQE routine sort the queue(s) based on number of I/O
operations.  The frequency of this routine would need to be determined; I
would start out at something like once every 5 minutes (300 seconds).

Now on to the question:  What reasons are there for not changing the
UCB$L_DDT field to point to a cloned DDT with a different startio vector?
I realize that the I/O database won't be consistent, in that DDB$L_DDT
won't be the same as UCB$L_DDT for devices that are being cached.  What
possible problems could that cause?  It would be possible to clone a
dangling DDB (i.e. there would be no path to it via IOC$GL_DEVLIST and
DDB$L_LINK fields, and its own DDB$L_LINK field would be 0) and have
UCB$L_DDB point to that, but I can't think of a good reason to do so.  I
definitely want any new device that is created to get the real disk
driver DDT, so I couldn't change the original DDB's DDB$L_DDT address to
point to the cloned DDT.

Can anyone think of problems this would cause?  Are there other options
that would be better?

Jon Pinkley  jon@clevax.wec.com  ...uunet!tron!clevax!jon (216)486-8300 x1335