<<< HUMANE::DISK$SCSI:[NOTES$LIBRARY]DIGITAL.NOTE;1 >>>
                      -< The Digital way of working >-
================================================================================
Note 5344.8              Looking for clustering info.                    8 of 11
STAR::KLEINSORGE "Fred Kleinsorge, OpenVMS Enginee" 112 lines 19-JUN-1997 10:50
                      -< Not an Official Response (JMHO) >-
--------------------------------------------------------------------------------

Note that most information on Galaxies has been either high-level or provided
under NDA, despite the widespread information that has appeared in the press and
on the network. So before anyone tries to hype the concept to customers, they
should check with Bill Hanley, who is the Program Manager.

The Galaxy architecture is not O/S specific, but the idea and the driving force
behind it has been OpenVMS. We're damned-if-we-do and damned-if-we-don't when it
comes down to the O/S bigots. If we can get beyond the O/S debate, I think there
is a good possibility that the future is in Network Computers. The latest Byte
has a good overview of the new "NUI" (Network User Interface) systems that are
on their way. It suggests the possibility of Application Servers that run web
servers and databases (Oracle, etc.) - where it is no longer interesting to the
*user* what the O/S is. But performance, scalability, and reliability *are*
important to both the user and the service provider. OpenVMS *may* have a second
life yet.

Galaxy supposes that, with help from the console, a system without hardware
partitions can be software partitioned, and multiple copies of the O/S can be
loaded and run concurrently. Moreover, if the system is software partitioned,
then the barriers between the O/S Instances are "thin", and it is possible to
provide things like shared memory sections, and migration of resources (like
CPUs) between the Instances. You will see the term Adaptive Partitioned Multi
Processing (APMP) used to describe the architecture.
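The partitioning and resource-migration ideas above can be sketched in miniature. This is a toy Python model, purely illustrative - the names and structures are mine, not any Galaxy API - showing the APMP notion of several O/S Instances in one box, each owning a subset of the CPUs, with a CPU reassigned between Instances without rebooting either:

```python
from dataclasses import dataclass, field

@dataclass
class Instance:
    """Toy model of one O/S Instance in a software-partitioned system."""
    name: str
    cpus: set = field(default_factory=set)

def migrate_cpu(cpu: int, src: "Instance", dst: "Instance") -> None:
    """Move a CPU between running Instances - the APMP idea in miniature."""
    src.cpus.remove(cpu)
    dst.cpus.add(cpu)

# Two Instances software-partitioned across an 8-CPU box (assumed layout).
vms1 = Instance("VMS1", {0, 1, 2, 3})
vms2 = Instance("VMS2", {4, 5, 6, 7})

# Shift capacity to where the load is, with no shutdown of either Instance.
migrate_cpu(3, vms1, vms2)
```

The point of the sketch is only that a CPU is a movable resource between domains, not a fixed attribute of one kernel.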
An appealing aspect of this with regard to OpenVMS is that multiple Instances
can interact as VMSCluster members. Because clusters have been a standard
feature of OpenVMS for a very long time, the management of a multi-Instance
system is, in many respects, no different from a standard multi-system cluster
(which a Galaxy could also be a part of). Galaxies are evolutionary, not
revolutionary.

Breaking a system into multiple independent O/S Instances has some distinct
advantages over classic SMP. The most obvious is that SMP does not scale well
over 8 CPUs. Scaling SMP up over 8 CPUs is an interesting challenge,
considering that Minsky conjectured, some 20 years ago, that 8 would be the
point where SMP scaling would no longer be linear. Hive-like O/S
implementations try to solve some of the thornier issues (like scheduler lock
contention) by implementing multiple schedulers in a single O/S domain. But
with a single O/S domain, you have actually increased your risk of a system
failure by increasing the complexity of the system. If we admit up front that
most system failures are software bugs, it is easy to see (although there is
some ongoing debate) that partitioning the system into multiple O/S domains may
allow many failures to affect only part of the system rather than bring the
entire system down. True fault/disaster tolerance still requires multiple
systems in either case.

Shared memory is one of the keys to the high-performance claims, as is the
ability to distribute the I/O subsystems around the system. Shared memory used
as a communications medium or lock manager provides *memory* speeds, not
network speeds. And while Memory Channel is fast, it is still essentially a
high-speed communications channel. The other aspect is shared application data
- for example, your database cache, which stays hot even if you reboot an
Instance.
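The "memory speeds, not network speeds" point can be illustrated with ordinary operating-system shared memory. The sketch below uses Python's standard `multiprocessing.shared_memory` module as a stand-in - Galaxy shared memory sections are a different facility with their own interface - to show the shape of the idea: one side creates a named section, a peer attaches to it by name, and a value posted by the writer is visible to the reader as a plain memory access, with no communications channel in between:

```python
from multiprocessing import shared_memory

# "Writer" side: create a shared section (the system picks a unique name).
shm = shared_memory.SharedMemory(create=True, size=64)
try:
    shm.buf[:5] = b"LOCK1"          # post a value, e.g. a lock-manager entry

    # "Reader" side: attach to the same section by name and read the value.
    # This is a memory reference, not a message over a network or channel.
    peer = shared_memory.SharedMemory(name=shm.name)
    value = bytes(peer.buf[:5])
    peer.close()
finally:
    shm.close()
    shm.unlink()                    # destroy the section when done
```

In a real Galaxy the attaching peer would be another O/S Instance in the same box rather than another process, but the performance argument is the same: the data never leaves memory.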
Breaking a system up into multiple O/S domains also lends itself to NUMA
architectures, by allowing the system to be broken up along its natural
performance boundaries - without the need to implement the new memory-coherency
layers for data migration and replication that are common on Hive-like
architectures.

Electronic migration of CPUs is way cool. Load balance without trying to figure
out how to snapshot an application state - just move the CPU to where the
action is. Moving a CPU via a GUI is a sexy demo, but just the first step in
automatic load adjustment. And scaling of a server can be achieved by
hot-swapping in new subsystems and components, which may require no shutdown,
or may require only one or more of the Instances to reboot. True continuous
computing. Buy what you need today, add to it as you grow. Never shut down the
entire system to upgrade.

Now what about the other O/S's? Well, Microsoft is on record as wanting a
shared-nothing model for multiple NT instances in a single box: implement a
high-speed communication interface (perhaps with shared memory) to talk between
Instances. Most UNIXes are doing Hive-like implementations (single domain,
multiple schedulers, multiple copies of portions of the kernel), and counting
on high-performance scientific computing to take advantage of parallel
processing. But there is nothing (except console and HAL/kernel work) that
would prevent another O/S from running in a software partition, or a system
from being designed with hardware-assisted partitioning (to minimize the kernel
rework). Can I envision other O/S's in the box, or even participating in a
Galaxy implementation? Sure. I can easily see a VMS customer who would love to
see a copy of NT in the same box. We are being careful not to preclude it, but
at the same time we are not driving the other O/S strategies. OpenVMS is the
driving force right now because, by luck, and design, and need, Galaxy can be
implemented with only a small investment on OpenVMS.
Will the other O/S's look at the same technology? I think they will (IMHO).
Will they go as far as OpenVMS is willing to go? I don't know. We may in fact
end up with the same "Cluster" FUD, as others do limited partitioning that
shares many of the characteristics of a Galaxy but lacks the infrastructure
that allows OpenVMS to exploit it - just as OpenVMS Clusters are not in the
same class as what many other people call "clusters". In the end, if OpenVMS
can run an Oracle 7 database application significantly faster than anyone else
in the world, then customers who need that performance will have only one place
they can get it. And the O/S in this case is not really significant. It's an
application engine with certain attributes regarding performance, scalability,
and reliability.