From: SMTP%"@CRVAX.SRI.COM,@UNIX.SRI.COM:lrw!leichter@LRW.COM"  2-SEP-1994 08:50:37.93
To: EVERHART
CC:
Subj: Re: re: posix

Message-Id: <9409020249.AA22260@uu3.psi.com>
Date: Thu, 1 Sep 94 22:36:03 EDT
From: Jerry Leichter
To: INFO-VAX@SRI.COM
Subject: Re: re: posix
X-Vms-Mail-To: UUCP%"ivax@meng.ucl.ac.uk"

Everhart@arisia.gce.com:
> When I heard an early presentation on VMS Posix, it was discussed
> that on a given piece of iron, unix took ~20 msec. to fire up a new
> process, VMS took over 100 msec. (sorry, I forget the figure) and
> that VMS Posix would reduce the amount of stuff to be carried and
> expected to get to ~70 msec. for fork.

[...]

Anyone know if there's any truth in the rumour that kernel based threads
are coming to VMS?  (Presumably these would make threads 'schedulable
entities', allowing threads to be scheduled independently, and also spread
across multiple processors.)  By my limited understanding, these would be
a much better match for fork than subprocesses are.  (?).

I've heard the rumors, too, from a well-informed source.  Don't know when
they will arrive, however.

Kernel-based threads would not, however, provide an effective replacement
for fork().  fork() creates a new address space (which starts off as a
copy of the original address space).  Multiple threads share a single
address space.

The "kernel-based" part just has to do with who schedules threads.  In a
kernel-based implementation, the schedulable entity is the thread, not the
process.  In a user-mode thread implementation - such as DECthreads - the
schedulable entity is the process.  The fact that, within that process, a
separate scheduler decides which thread to run when the process is given
control is of no concern to the operating system, which in fact is
generally not even aware that multiple user-level threads exist.

There are advantages and disadvantages to each approach.  The pluses for
a kernel-based design include:

	- On a multiprocessor, multiple threads can run simultaneously on
	  different processors;

	- OS-level blocking operations can easily be set up to block the
	  thread, not the process.  (Since the OS doesn't know there are
	  multiple user-level threads within a process, if one thread does
	  a QIOW, all the threads in the process wait for it to complete.)

	- Thread priorities can mean something across processes, not just
	  within them.

Advantages of user-level threads include:

	- They can be implemented much more cheaply.  Switching from one
	  user-level thread to another takes 10-20 instructions in a
	  reasonable implementation.  It's hard to get within a factor of
	  10 of that cost in a kernel-mode implementation.

	- User-level threads don't *really* run in parallel, so
	  inter-thread synchronization is much simpler.  This (a) is
	  easier to get right, a worthwhile consideration if you don't
	  really *need* true parallelism; and (b) is much more efficient.
	  In fact, in many cases a non-preemptive user-level thread
	  implementation is good enough, making synchronization even
	  easier.  (A rough sketch of that model follows this list.)

	- User-level threads can be implemented without changing the OS.
	  Since the code that manages them is purely user-mode code, it
	  can safely be changed.  Thus, user-level threads can be much
	  more flexible.

	- Various management policy decisions don't need to be made.  In
	  a kernel-based thread implementation, you could increase your
	  share of the CPU by simply starting many threads, set up so
	  that they are really all doing pieces of the same thing.  In a
	  user-level implementation, this gains you nothing, since it's
	  still your one process that gets scheduled.  There are ways to
	  handle this, but they seem complex (and add yet more overhead
	  to a kernel-based implementation).
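
To make the user-level model concrete, here's a rough sketch of the sort
of thing a cooperative package does, written for a typical Unix system
using the ucontext routines.  It is not DECthreads (or any real package),
and the ucontext calls are heavier than the 10-20 instruction switch
mentioned above, but the shape is the same: two "threads" share one
address space, switch only when they explicitly yield, and bump a shared
counter with no locking at all, because neither can be preempted in the
middle of an update.

    #include <stdio.h>
    #include <ucontext.h>

    /* Two cooperative user-level threads.  All switching is explicit
       and happens entirely in user mode; the OS sees one process. */

    static ucontext_t main_ctx, ctx_a, ctx_b;
    static char stack_a[64 * 1024], stack_b[64 * 1024];
    static int counter = 0;                /* shared, yet needs no lock */

    static void thread_a(void)
    {
        int i;
        for (i = 0; i < 3; i++) {
            counter++;
            printf("A sees counter = %d\n", counter);
            swapcontext(&ctx_a, &ctx_b);   /* hand control to B */
        }
    }

    static void thread_b(void)
    {
        int i;
        for (i = 0; i < 3; i++) {
            counter++;
            printf("B sees counter = %d\n", counter);
            swapcontext(&ctx_b, &ctx_a);   /* and back to A */
        }
    }

    int main(void)
    {
        getcontext(&ctx_a);
        ctx_a.uc_stack.ss_sp = stack_a;
        ctx_a.uc_stack.ss_size = sizeof stack_a;
        ctx_a.uc_link = &main_ctx;         /* resume main when A returns */
        makecontext(&ctx_a, thread_a, 0);

        getcontext(&ctx_b);
        ctx_b.uc_stack.ss_sp = stack_b;
        ctx_b.uc_stack.ss_size = sizeof stack_b;
        ctx_b.uc_link = &main_ctx;
        makecontext(&ctx_b, thread_b, 0);

        swapcontext(&main_ctx, &ctx_a);    /* start thread A */
        return 0;
    }

No scheduling decision here ever involves the kernel; the process is the
only thing the OS sees, which is exactly the tradeoff described above.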

There's actually an intermediate technique, called scheduler activations.
In brief summary, there is now a third object, an "activation", that is
the schedulable entity.  A process has one or more threads, and one or
more activations.  When the kernel schedules one of the activations in a
process, the code that gets control in the process is the user-level
thread scheduler, which picks a thread to run.  If there's one activation
in the process, you get a user-level implementation; if there are as many
activations as threads, you get a kernel-based implementation.  In
between, you get something that trades off the advantages and
disadvantages of the two extremes.  I think the Solaris lwp (light-weight
process) package does something of this general sort.

							-- Jerry