From: Glenn C. Everhart
To: Lenny Szubowicz
Re: IRP structure
Date: 24-March-1997

Problem:

In adding services to VMS I/O (in my case, for multipath) via I/O intercepts, it is necessary to modify some IRP entries during the intercept period, and replace them so that the IRP can be post processed in the same context as it was received. The traditional way to do this has been either to add per application fields to the IRP definition (which has resulted in a very large IRP) or to allocate a structure in pool and somehow find it from the IRP. Pool allocation can be slow, will certainly tend to fragment pool, and growing the IRP with unique fields is really feasible only at .0 releases of VMS.

However, speed and complexity of doing interception drop noticeably if a "context stack" is available within the IRP itself. I would like to allocate this and a few other fields at the end of the IRP definition for use generally where available.

Goals:

* Allow fastest feasible add-in of processing of I/O.
* Function correctly when old length IRPs are seen.
* Function correctly where IRPs are copied up to the "old" length from the longer IRPs.

Alternatives:

There seems to be no alternative, if anything is added to the IRP, but to have sufficient checking information to allow one to determine when the IRP truly has the information and when it has not. Moreover, a finite stack may overflow and provision must exist for handling that overflow, as well as finding "old style" IRPs, by allocating structures in pool.

Any scheme to store new information in fields not currently used within IRPs risks some application not knowing about the fields and even using them itself. The IRP$L_EXTEND field could be used to add an IRPE, of course, but since postprocessing must occur in layers and IRPEs are used for many other things, finding the "right" IRPE seems to involve a search necessarily.
Using some other list and an access method for finding "this" IRP's data on it seems cleaner and less likely to cause problems for other software that may use the IRPE list. Thus the "fallback" action can as well store context in a separate structure not required to be pointed at directly by the IRP.

Design:

Consider adding the following to the current IRP:

1. A single bit status for the IRP, irp$v_gotstk, indicating the IRP has a stack area.
2. The irp$w_size field will be larger than in old IRPs.
3. The following new areas will be added after the current end of the IRP:

   IRP$L_CURCSP   Address of the current IRP stack frame.
   IRP$L_STKFLGS  Flags longword. Defined at first is irp$v_onstack, indicating the context data is on the stack. On stack overflow this flag is cleared.
   IRP$A_CTXSTK   Area to hold the context stack.

For the IRP to be valid, the field IRP$L_CURCSP must point into this IRP's IRP$A_CTXSTK area, and that area should be able to hold several context sets. The context area is used by multipath, and twice again for snap capable disk. The context area used for multipath is 8 longwords in length, holding the values of IRP$L_CURCSP and of IRP$L_STKFLGS at entry as well as save areas for ucb, media, PID, and stat fields. Thus a size of perhaps 50 longwords would seem advisable.

How To Test For A New Format IRP:

An IRP which has the irp$v_gotstk bit set, is long enough (irp$w_size big enough to hold the stack area), and whose IRP$L_CURCSP field points into the area IRP$A_CTXSTK for this IRP is deemed to be a valid "new" one, and otherwise is not.

What Happens with Old IRPs:

Note that an old length IRP will fail this test. An IRP which is copied from a new IRP will also fail (CURCSP not in CTXSTK) unless of course the copy updates the context stack, which would require knowledge of the new fields. Therefore context information in the IRP would be used only where it is valid.
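
The three-part test for a new format IRP can be sketched in C as below. This is only an illustration under stated assumptions: the structure irp_sketch, its old_cells placeholder, the 50 longword constant taken from the estimate above, and the function name irp_has_ctx_stack are all hypothetical and do not reflect the real IRP layout. The sketch also assumes that IRP$L_CURCSP may equal the address just past the end of the stack area, meaning "stack full".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTXSTK_LONGWORDS 50u  /* the memo suggests "perhaps 50 longwords" */

/* Simplified model of an extended IRP; NOT the real VMS layout. */
struct irp_sketch {
    uint16_t  size;           /* irp$w_size: allocated length in bytes    */
    bool      gotstk;         /* irp$v_gotstk: IRP claims to have a stack */
    uint8_t   old_cells[64];  /* hypothetical stand-in for existing cells */
    /* ---- new cells, added past the current end of the IRP ---- */
    uint32_t *curcsp;                    /* IRP$L_CURCSP  */
    uint32_t  stkflgs;                   /* IRP$L_STKFLGS */
    uint32_t  ctxstk[CTXSTK_LONGWORDS];  /* IRP$A_CTXSTK  */
};

/* Returns true only if all three conditions from the memo hold. */
static bool irp_has_ctx_stack(const struct irp_sketch *irp)
{
    assert(irp != NULL);

    /* Tests 1 and 2 read only cells that an old length IRP also has. */
    if (!irp->gotstk)
        return false;
    if (irp->size < sizeof *irp)
        return false;

    /* Test 3: the stack pointer must lie inside THIS IRP's stack area.
     * Compare as integers so a stale pointer into another IRP is safe
     * to examine. One past the end is accepted as "stack full"
     * (an assumption of this sketch). */
    uintptr_t p  = (uintptr_t)irp->curcsp;
    uintptr_t lo = (uintptr_t)&irp->ctxstk[0];
    uintptr_t hi = (uintptr_t)&irp->ctxstk[CTXSTK_LONGWORDS];
    return p >= lo && p <= hi;
}
```

In this model a byte-for-byte copy of a new IRP fails the third test, because the copied IRP$L_CURCSP still points into the original's stack area rather than the copy's own.
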
The only practice which could get into trouble would be if someone allocated an old sized IRP using a constant size, then copied the IRP from a new IRP including copying the 12 byte header. This would cause the size to be incorrect. While the copy would still not be treated as a new IRP (the copy's stack pointer would be invalid), the size deallocated would be different from the allocated size. This kind of copy is however a highly suspect practice and should be stamped out if it exists in some third party code. I do not believe it is in fact used. The more common practice would be to allocate up to some constant size and copy up to that size. Such a copy would likewise not be mistaken for a new IRP.

If IRPs created by fast-io or by sysqioreq have the new format, most IRPs in the system will have the extra storage needed.

Usage:

When an IRP is encountered that is in the new format (per the tests above), if there is room on the stack for the context one saves, it is saved there and the curcsp field is updated to point to the next free location. When the context is used, the old curcsp value is reloaded.

Should the stack be missing or lack enough space to hold a full context, the irp$v_onstack flag is cleared (if present) and the curcsp field is left alone (if present). Another structure is allocated in pool to store context and linked to a queue maintained by the intercept (or a hash block is allocated and used) and the context is stored there. This hash block or queue element must have the IRP address so that the correct one can be found at post processing time. The context block must also have the prior state of the flags so that when postprocessing is done on the IRP, the flags can be reset. That way, if the stack overflowed, the old flags will indicate that the next context down is on the IRP context stack.
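
The save and restore paths described above might look roughly like the following C sketch. It is a simplified model rather than VMS code: the structures irp, ctx_frame and aux_ctx, the fixed frame size, the six-frame stack, and the names ctx_save, ctx_restore and aux_head are all hypothetical, malloc stands in for pool allocation, and the synchronization a real intercept would need around its queue is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define CTX_FRAMES    6     /* roughly 50 longwords at 8 per frame */
#define STK_M_ONSTACK 0x1u  /* stands in for irp$v_onstack */

struct ctx_frame {               /* one saved context */
    struct ctx_frame *prev_csp;  /* IRP$L_CURCSP at entry */
    uint32_t prev_flags;         /* IRP$L_STKFLGS at entry */
    void    *ucb;
    uint32_t media, pid, stat;
};

struct irp {                     /* simplified model, not the real layout */
    uint16_t size;               /* irp$w_size */
    bool     gotstk;             /* irp$v_gotstk */
    void    *ucb;
    uint32_t media, pid, stat;
    struct ctx_frame *curcsp;    /* IRP$L_CURCSP: next free frame */
    uint32_t stkflgs;            /* IRP$L_STKFLGS */
    struct ctx_frame ctxstk[CTX_FRAMES];  /* IRP$A_CTXSTK */
};

struct aux_ctx {                 /* fallback block, standing in for pool */
    struct aux_ctx *next;
    struct irp *irp;             /* lets postprocessing find the right block */
    struct ctx_frame frame;
};
static struct aux_ctx *aux_head; /* queue maintained by the intercept */

static bool irp_is_new(const struct irp *irp)
{
    if (!irp->gotstk || irp->size < sizeof *irp)
        return false;
    uintptr_t p = (uintptr_t)irp->curcsp;
    return p >= (uintptr_t)&irp->ctxstk[0] &&
           p <= (uintptr_t)&irp->ctxstk[CTX_FRAMES];
}

/* Save the IRP's current context; returns false only if allocation fails. */
static bool ctx_save(struct irp *irp)
{
    assert(irp != NULL);
    bool is_new = irp_is_new(irp);
    struct ctx_frame f = {
        .prev_csp = is_new ? irp->curcsp : NULL,
        .prev_flags = is_new ? irp->stkflgs : 0,
        .ucb = irp->ucb, .media = irp->media, .pid = irp->pid, .stat = irp->stat,
    };
    if (is_new && irp->curcsp < &irp->ctxstk[CTX_FRAMES]) {
        *irp->curcsp++ = f;              /* room on the in-IRP stack */
        irp->stkflgs |= STK_M_ONSTACK;
        return true;
    }
    struct aux_ctx *a = malloc(sizeof *a);  /* old IRP or stack overflow */
    if (a == NULL)
        return false;
    a->irp = irp;
    a->frame = f;
    a->next = aux_head;                  /* newest first, so layers unwind LIFO */
    aux_head = a;
    if (is_new)
        irp->stkflgs &= ~STK_M_ONSTACK;  /* curcsp is left alone */
    return true;
}

static void ctx_load(struct irp *irp, const struct ctx_frame *f)
{
    irp->ucb = f->ucb; irp->media = f->media;
    irp->pid = f->pid; irp->stat = f->stat;
}

/* Reload the most recently saved context; false if none is found. */
static bool ctx_restore(struct irp *irp)
{
    assert(irp != NULL);
    bool is_new = irp_is_new(irp);
    if (is_new && (irp->stkflgs & STK_M_ONSTACK) && irp->curcsp > irp->ctxstk) {
        const struct ctx_frame *f = irp->curcsp - 1;
        ctx_load(irp, f);
        irp->stkflgs = f->prev_flags;
        irp->curcsp = f->prev_csp;       /* old curcsp gets reloaded */
        return true;
    }
    for (struct aux_ctx **pp = &aux_head; *pp != NULL; pp = &(*pp)->next) {
        struct aux_ctx *a = *pp;
        if (a->irp != irp)
            continue;
        ctx_load(irp, &a->frame);
        if (is_new)
            irp->stkflgs = a->frame.prev_flags;  /* may say "back on stack" */
        *pp = a->next;
        free(a);
        return true;
    }
    return false;
}
```

Because each saved frame carries the flags as they were at entry, restoring an overflowed context puts irp$v_onstack back to its earlier state, so the next layer down looks for its context in the right place.
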
At post processing, when the context information must be restored to post process the IRP in the correct device context, the context location is determined as it is when the information is to be stored and is reloaded into the IRP prior to completing its posting. If an auxiliary structure was allocated, it must be freed at post processing time too, once its data is reloaded.

With the new code being added to allow post processing to be handled at IPL 8 by intercepts, this gives a low overhead way to process I/O interception and add new processing without much change to any drivers. Any code that produces old style IRPs will find that they also work, and are not interfered with by this logic. Any cloned or copied IRPs will work as before with the "auxiliary structure" path of this logic. Most IRPs will however be able to use the IRP itself, with a few tests added which involve no PALcode calls. Only code that corrupts the IRP header could get into trouble, and that is a tiny risk due to a basically broken coding practice anyway.

Conclusion:

This scheme is low in risk, and provides significant performance and orderliness benefits to future I/O level intercepts including those wanted by EDO. It will not break existing drivers or code, but will allow new intercepts to run faster than otherwise possible and will avoid the need in the future for yet more dedicated IRP cells.