Path: news.mitre.org!blanket.mitre.org!philabs!newsjunkie.ans.net!newsfeeds.ans.net!news-was.dfn.de!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.idt.net!news.voicenet.com!news.new-york.net!news.columbia.edu!news.cs.columbia.edu!versed.smarts.com!usenet
From: Jerry Leichter
Newsgroups: comp.arch
Subject: Re: IA64 Self Virtualizable?
Date: Thu, 20 Nov 1997 15:07:19 -0500
Organization: System Management ARTS
Lines: 98
Message-ID: <34749877.4FED@smarts.com>
References: <64q6l9$q0v@crl.crl.com> <64u9q0$jlu$1@xs155.wins.uva.nl>
NNTP-Posting-Host: just.smarts.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 3.01Gold (X11; I; SunOS 5.5 sun4d)
To: Marcel Beemster

| >Does anyone know whether the IA64 instruction set will be
| >self-virtualizable?
|
| Ahum. Would anyone be so good as to DEFINE self-virtualizability of an
| instruction set architecture?
|
| Is it a property of the hardware to be its virtual self?

Mainly. There's a whole PhD dissertation (at Harvard, in the early
'70s, I believe) on this subject. The issues, and even the
definitions, can get subtle.

In one sense, self-virtualization is trivial: Write, in the ISP of the
processor, a simulator for the full ISP. Now run the "guest" OSes under
the simulator. Complete virtualization, complete safety. Horrible
performance - it's tough to write a simulator that averages fewer than
5 real instructions per simulated instruction (i.e., a factor of *at
least* 5 slowdown), and simulating things like I/O gets even more
expensive.

To be usable, a self-virtualizing system has to let the hardware handle
the vast bulk of the instructions executed - say, >95% - with no extra
overhead. One line is often drawn between user and privileged modes:
Let all user-mode code run on the raw hardware, but simulate all
privileged code (by trapping all instructions that switch to privileged
mode and simulating until a return to user mode). That's better, but
still expensive - and may not even be workable. (User-mode code may
have direct read access to memory owned by privileged code. If there
are constraints on where this memory may appear in the address range,
you may find it difficult to find a place to "hide" the hypervisor. In
practice, any good virtual-memory system lets you fake this, though it
may call for a lot of map swapping.)

Simulating all the privileged code *still* has way too much overhead to
be really usable. *Most* privileged code is doing non-privileged
things; it's just that it has access to areas of memory that are
normally out of bounds to user-mode code. The next step, then, is to
run the privileged code *but in user mode*. When the hypervisor sees
the guest code try to enter the privileged state, it swaps the memory
mappings around so that the guest's "kernel areas" are now accessible.
Then it lets the guest continue. When the guest tries to execute an
instruction that is *really* privileged - access an I/O device, change
the virtual memory mappings - the hypervisor gains control and
simulates the effects.

An ISP gotcha that can kill you here: If the instruction that switches
from privileged mode back to user mode isn't privileged, the hypervisor
will have no way of knowing when the guest OS is returning control to
user code. That would allow user-mode code within the guest to act as
if it were part of the guest OS. Bad news.

A more subtle problem is a non-privileged instruction that lets a
user-mode program determine what mode the machine is in. The OS will
expect to be told "kernel mode". (This has real uses, as when an OS
routine is designed to be callable from both user and kernel mode, but
with additional checking in user mode.) The x86 has at least one
instruction of this form, making it non-self-virtualizable. (The "fix"
is to make that instruction privileged. Then the hypervisor traps it,
checks to see what *simulated* mode the guest is in, and returns the
appropriate value.)

I/O is a big headache, especially on machines that use memory-mapped
device registers: The hypervisor has to make those pages inaccessible,
trap the accesses, and then figure out whether they are permitted and
what they should really mean. The old 360 channel program design, as
John Mashey pointed out in an earlier posting, does well here because
the hypervisor gets essentially full information about the actual I/O
to be done in a single trap, rather than having to get involved once
per character (in the worst case, for simple serial devices).

BTW, the x86 has the *potential* for a good solution here, since it's
possible to control the accessibility of I/O devices (at least those
that use I/O ports, rather than memory mapping) on a fine-grained
basis. I forget the details, but you can set things up so that the
innermost ring is the hypervisor, and the next outer one has access to
exactly the I/O devices it is allowed to manipulate. (Two rough
sketches of these ideas follow.)
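To make the trap-and-emulate scheme described above concrete, here is a
minimal sketch in C of the dispatch a hypervisor might perform on each
trap. Everything in it is hypothetical - the types, the opcode names,
and the helper stubs are invented for illustration, and a real
hypervisor would decode raw instruction bytes rather than receive a
pre-decoded structure - but the logic is the one described above: keep
a *simulated* mode per guest, swap mappings on mode switches, emulate
what's really privileged, and answer mode queries with the simulated
mode.

/* Minimal trap-and-emulate dispatch sketch. Not any real hypervisor's
   code; all names here are invented for illustration. */

#include <stdint.h>

typedef enum { GUEST_USER, GUEST_KERNEL } guest_mode_t;

typedef struct {
    guest_mode_t mode;       /* mode the guest *believes* it is in */
    uint64_t     regs[32];   /* guest register file */
    uint64_t     pc;         /* guest program counter */
} guest_t;

/* A pre-decoded trapped instruction; real code would decode bytes. */
typedef struct {
    enum { OP_READ_MODE, OP_ENTER_KERNEL, OP_RETURN_TO_USER,
           OP_IO_ACCESS, OP_LOAD_MAP } op;
    int      dest;           /* destination register, if any */
    uint64_t arg;            /* port number, map address, ... */
} insn_t;

/* Stubs for the machinery the sketch leans on. */
static void swap_in_kernel_areas(guest_t *g)           { (void)g; }
static void swap_out_kernel_areas(guest_t *g)          { (void)g; }
static void emulate_io(guest_t *g, uint64_t port)      { (void)g; (void)port; }
static void emulate_map_load(guest_t *g, uint64_t map) { (void)g; (void)map; }
static void fault_guest(guest_t *g)                    { (void)g; }

/* Called whenever guest code - always running in real user mode -
   traps on a privileged (or privileged-made) instruction. */
void on_trap(guest_t *g, const insn_t *in)
{
    switch (in->op) {
    case OP_READ_MODE:
        /* The subtle case from the post: answer with the guest's
           simulated mode, never the real (user) mode. */
        g->regs[in->dest] = (g->mode == GUEST_KERNEL);
        break;
    case OP_ENTER_KERNEL:             /* guest system call / trap entry */
        g->mode = GUEST_KERNEL;
        swap_in_kernel_areas(g);      /* make "kernel areas" accessible */
        break;
    case OP_RETURN_TO_USER:           /* must trap, or we lose track    */
        g->mode = GUEST_USER;
        swap_out_kernel_areas(g);     /* hide kernel areas again        */
        break;
    case OP_IO_ACCESS:                /* really privileged: emulate it  */
        if (g->mode == GUEST_KERNEL)
            emulate_io(g, in->arg);
        else
            fault_guest(g);           /* reflect the fault to the guest */
        break;
    case OP_LOAD_MAP:                 /* guest changing its page tables */
        if (g->mode == GUEST_KERNEL)
            emulate_map_load(g, in->arg);
        else
            fault_guest(g);
        break;
    }
    g->pc += 4;  /* skip the emulated instruction (fixed-size ISA assumed) */
}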
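The fine-grained x86 I/O control alluded to above is the I/O-permission
bitmap the processor keeps in the TSS: one bit per port, consulted on
IN/OUT when the current privilege level isn't privileged enough, with a
set bit forcing a trap. The sketch below shows only the bitmap
bookkeeping a hypervisor might do - the guest_io_t wrapper and function
names are invented, but the convention (65536 ports, one bit each, set
bit = trap) is the x86 one. Ports left denied trap to the hypervisor
and get emulated; ports the guest owns outright run at full hardware
speed.

/* Sketch of per-guest I/O port permissions, patterned on the x86
   TSS I/O-permission bitmap. The wrapper type is invented. */

#include <stdint.h>
#include <string.h>

#define IO_PORTS     65536
#define BITMAP_BYTES (IO_PORTS / 8)

typedef struct {
    uint8_t bitmap[BITMAP_BYTES];
} guest_io_t;

/* Default: every port access traps into the hypervisor for emulation. */
void io_deny_all(guest_io_t *g)
{
    memset(g->bitmap, 0xFF, BITMAP_BYTES);
}

/* Punch a hole for a device the guest is allowed to drive directly. */
void io_allow_range(guest_io_t *g, unsigned first, unsigned count)
{
    for (unsigned p = first; p < first + count; p++)
        g->bitmap[p / 8] &= (uint8_t)~(1u << (p % 8));
}

A guest that owns, say, the first serial port would get io_deny_all()
followed by io_allow_range(g, 0x3F8, 8); everything else it touches
traps and is emulated.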
> Is it a property of the OS to be virtually someone else?

Generally, true self-virtualization is taken to mean that *any* OS can
be supported transparently. If you're willing to modify the OS - to
ensure that it doesn't use that x86 instruction that looks at the
"wrong" processor state, for example - the problem is *much* easier.
Of course, then you can't run the "stock" OS that you would normally
use on the raw hardware within a virtual machine for debugging or what
have you.

Practical self-virtualizing systems (of which VM/370 is probably the
only real example, beyond - I'm sure - some experiments here and there)
tended to *support* any OS, but to *favor* OSes that were aware of the
hypervisor and cooperated with it. (Think about what happens when the
guest and hypervisor both try to do page management independently ...
much, much better for the guest to tell the hypervisor what it's up
to.) The difference in performance can be substantial - that last 1%
(or whatever) needed for full self-virtualization can really cost you.

> Is it a property of the OS that requires hardware support?
> If so, what support?

-- Jerry