From: SMTP%"Info-VAX-Request@Mvb.Saic.Com" 13-SEP-1994 09:04:34.59 To: EVERHART CC: Subj: Re: SPAWN performance (was: posix) From: "GWDGV1::MOELLER" X-Newsgroups: comp.os.vms Subject: Re: SPAWN performance (was: posix) Message-Id: <30139400@MVB.SAIC.COM> Date: Tue, 13 Sep 1994 02:23:39 +0200 Organization: Info-Vax<==>Comp.Os.Vms Gateway X-Gateway-Source-Info: Mailing List Lines: 89 To: Info-VAX@Mvb.Saic.Com Back from vacation, it's nice to see somebody else writing about SPAWN slowness and its reasons! It was me who several times brought up a "scheduler patch" which happens to cure the SPAWN problem. Let me explain how I arrived at this patch ... At school, I learned a few things about what to expect from a time-slice based preemptive scheduler: (1) When a computable process has used up its time slice ("quantum end"): - (maybe) drop the priority - put it at the tail of some scheduler queue, so that other processes of the same priority can get their share - reset time-slice to a new QUANTUM. (2) When a computable process is 'preempted' by a higher priority process: - have the process proceed as soon as possible (given its priority) with the remainder of its current time-slice, so as to _minimize_ the effect of that other process interfering. (There's a third case of a process giving up the CPU voluntarily which is not so clear-cut due to priority boosts - this policy doesn't matter here.) Some years ago, when I was looking into the VMS scheduler in order to understand some peculiar behaviour not relevant to this discussion, I expected it to behave approximately as described, but found that it doesn't. Instead, it had no provision for (2) above, but treated preemption practically the same as "quantum end". This means that a process that _happens_ to get preempted (note that this is an event unrelated to that process' behaviour!) ends up at the end of the scheduler queue, and has to wait for competing processes (at equal priority) to get their full QUANTUM before it'll be scheduled again. THAT'S NOT FAIR! Ok, most of the time those competing processes won't get their full QUANTUM either, being subject to preemption as well, but note that this also defeats the idea of priority boosts (you do want a process to get some _quantum_ at higher priority sometimes, not a random amount of CPU time). When I saw this on VMS V5.4, there had been some inactive & unfinished code around in the VMS scheduler to support more than one scheduling strategy since VMS V5.0, so it was practical to patch it to handle the case (2) as described above. I did so - purely to confirm that what I'd learned couldn't be all wrong - and indeed there was no visible effect at first. Then, several days later, I realized that I wouldn't see the SPAWN command (from inside MAIL, typically) take as long as it often did before ... I admit having trouble with the explanation why the traditional VMS scheduling affects SPAWN. Basically, the parent process does a significant number (dependent on the number of symbols, logicals, and key definitions) of mailbox write QIOs (_not_ QIOWs, the only synchronisation is via "ressource wait"). Once started up, the subprocess reads from the mailbox via QIOW. In my understanding (which could be wrong), the parent doesn't voluntarily give up the CPU during the write/read cycles, but is the one that gets preempted by the child which gets a priority boost as soon as a QIOW(read) completes. After very little work, the child waits at the QIOW again. 
Now, with traditional VMS scheduling, the parent is then at the tail of
its scheduler queue, and has to wait for all compute-bound processes at
the same priority (i.e. DEFPRI) to use up their QUANTUM before being
allowed just one more QIO (which will start the cycle again). (With the
default QUANTUM of 200 ms, a spawn needing a hundred such cycles
against even a single compute-bound competitor can stall for some 20
seconds.) With my 'modified' scheduler, however, the parent is allowed
several QIOs (and preemptions) before having to yield to other
computable processes at an actual quantum end.

The scheduler strategy that took a somewhat longish patch under VMS
V5.4 became a 'feature' under VMS V5.5 - in the context of POSIX
threads. Just by coincidence (:-), POSIX threads allow a choice of
scheduling policy: non-preemptive, preemptive, and (on VMS only) "VMS
traditional". Guess what the "preemptive" policy does ... The "one-line
patch" (as I've been quoted lately) consists of nothing but treating
all "VMS traditionally scheduled" processes like "POSIX preemptive"
threads - not an unreasonable choice, as I've found, but not exactly
the scheduler authors' intention.

Seeing that there _may_ be a reason to stick to VMS traditional
scheduling (PLEASE someone tell me one, other than "because it's
traditional"), I'd really like the default scheduling policy to be
determined by a SYSGEN parameter. The parameter itself has been there
since at least VMS V5 (SCH_CTLFLAGS); it's just that the code
referencing it has consistently been commented out (up to and including
V6.1). Otherwise there'd be a bit in this bitmask, called something
like PREEMPT_RESUME, with approximately the same effect as my "one-line
patch" ... See lines 374 and 375 of the OVMS VAX V6.1 listings!

BTW, since we're currently stuck at 5.5-2, I don't have a tested
version of the patch for V6; the patch for 5.5(-x) is available upon
request ...

Wolfgang J. Moeller, Tel. +49 551 201516 or -510, GWDG, D-37077 Goettingen, F.R.Germany
PSI%(0262)45050859008::MOELLER | Disclaimer: No claim intended!
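P.S. For anyone who wants the requeue difference spelled out, here's a
toy model in portable C of cases (1) and (2) from the top of this
message. None of this is actual VMS scheduler code - the names, data
structures and QUANTUM value are illustrative only:

    #include <stddef.h>

    #define QUANTUM 20                  /* ticks per full time slice */

    typedef struct process {
        struct process *next;
        int ticks_left;                 /* remainder of current slice */
    } process;

    typedef struct {
        process *head, *tail;
    } queue;                            /* ready queue, one priority */

    static void enqueue_tail(queue *q, process *p)
    {
        p->next = NULL;
        if (q->tail != NULL)
            q->tail->next = p;
        else
            q->head = p;
        q->tail = p;
    }

    static void enqueue_head(queue *q, process *p)
    {
        p->next = q->head;
        if (q->tail == NULL)
            q->tail = p;
        q->head = p;
    }

    /* Case (1), quantum end: go to the tail so equal-priority peers
       get their share, and start over with a fresh QUANTUM. */
    void quantum_end(queue *q, process *p)
    {
        p->ticks_left = QUANTUM;
        enqueue_tail(q, p);
    }

    /* Case (2), VMS traditional: preemption is treated practically
       the same as quantum end - the victim waits behind every peer,
       although it did nothing to deserve that. */
    void preempt_traditional(queue *q, process *p)
    {
        quantum_end(q, p);
    }

    /* Case (2), patched ("preempt/resume"): keep the remainder of
       the slice and rejoin at the head of the queue, minimizing the
       interference from the higher-priority process. */
    void preempt_resume(queue *q, process *p)
    {
        enqueue_head(q, p);             /* ticks_left unchanged */
    }

The entire difference my patch makes is that the last function gets
used instead of the one before it.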