From: SMTP%"parris@ssdevo.enet.dec.com" 4-AUG-1994 16:54:59.91 To: EVERHART CC: Subj: Re: Advice on disk storage? From: parris@ssdevo.enet.dec.com () X-Newsgroups: comp.os.vms Subject: Re: Advice on disk storage? Date: 4 Aug 1994 17:26:37 GMT Organization: Digital Equipment Corporation Lines: 76 Distribution: world Message-ID: <31r8cd$9bf@nntpd2.cxo.dec.com> Reply-To: parris@ssdevo.enet.dec.com () NNTP-Posting-Host: ssdevo X-Newsreader: mxrn 6.18-9 To: Info-VAX@CRVAX.SRI.COM X-Gateway-Source-Info: USENET In article <28085725@MVB.SAIC.COM>, ivax@meng.ucl.ac.uk (Mark Iline - Info-VAX account) writes: |> > In case you don't get a lot of money, and get stuck with mostly what you have, |> > consider using StorageWorks RAID Software for OpenVMS to form RAID-5 arrays |> > of your existing RA disks, so you don't lose data when they fail. With RAID-5, |> > you only need one additional disk to protect a set of several disks, compared |> > with RAID-1 (Volume Shadowing), where you need an extra disk for each one |> > you need to protect. |> |> What's the performance of RAID-5 (particularly this implementation) like as |> compared to an 'un-RAIDed' disk ? (ie if you put n+1 RA disks into a RAID-5 |> array, how does the array's performance compare to n RA disks ?) How does |> this vary with the proportion of reads to writes ? |> |> I seem to remember that write performance of RAID-5 was not too good, |> because data would have to be read from several disks in order to calculate |> the redundancy information. There was a good article on RAID performance by Ken Bates in DEC Professional magazine a few months ago, showing graphs of performance and how it varies with read/write ratio for RAID Levels 0, 1, 0+1, 5, and JBOD (just a bunch of disks). In most systems, the I/O loading on disks is unbalanced: Ken's rule of thumb is that 55% of the I/Os go to 20% of the disks. These "hot" disks can tend to be a bottleneck in overall system performance. For RAID-5 reads, performance with typical OpenVMS applications (which tend to have fairly small I/O sizes) tends to be significantly better than with regular disks because the data is striped across the spindles and this results in even balancing of the load across all the available spindles, similar to the effect of disk striping (RAID-0), but even slightly better in theory because with RAID-5 you have one additional spindle working compared with a RAID-0 set. For RAID-5 writes, performance is lower, because you have to: 1) read the old data and the old parity from their two separate disks (which is done in parallel, but on the average will take about 40% longer than a single read, because one of the disks will tend to take longer than the other to complete the read, and you have to wait for whichever one is slowest) 2) recalculate the parity 3) write the new data and the new parity to their two separate disks Note that only two disks of the set are involved in a given write, not all of them. Our implementation is a bit worse than the above, because we also protect against "write holes" (where a failure occurs during a write, when one but not both of the parity or data get written to the disk, resulting in parity which does not reflect the data, and data corruption could thus result if you lose an array member and had to reconstruct the data). We set a bit on the disk in an area close to the parity to indicate that it is invalid before we try to update the new data and parity; if both writes complete successfully, we reset the bit and all is well. 
When we bind an array, we quickly scan these bits to determine whether
there were any "write holes", and fix them up if we find any.

In our users' experience and in our tests, we find that if the read/write
ratio is fairly high (somewhere in the 80%-90% range), performance is the
same as or better than just a bunch of disks (JBOD), because the gain on
reads tends to offset or outweigh the penalty on writes.

For applications where the read/write ratio is lower and write
performance is critical, we recommend considering a RAID 0+1 array
(striped shadow sets) instead. Shadowing (RAID-1) incurs only the
roughly-40% penalty of doing two writes in parallel to unsynchronized
spindles, and you gain two spindles over which to distribute the reads.
Putting RAID-0 striping on top of that balances the load evenly across
the shadow sets.

We believe read caching can help RAID-5 write performance, because it
would tend to keep old parity blocks (each of which is common to several
data blocks) in cache. And if write latency is low (as with solid-state
disks, which have no rotational latency between the reads and the
subsequent writes of the data and parity, or with non-volatile RAM in a
controller providing a safe write-back cache), then the write penalty for
RAID-5 (and for our write-hole protection scheme) tends to disappear.
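
As a back-of-the-envelope illustration of that break-even point, here is
a small model (a sketch with assumed costs, not measurements from the
posting: small I/Os, one disk operation per read, four per RAID-5 small
write, write-hole bit updates ignored). It compares per-spindle load for
n JBOD disks against an (n+1)-member RAID-5 set:

    def raid5_per_spindle_load(r, n):
        # r = fraction of host I/Os that are reads; a small write costs
        # four disk operations (two reads plus two writes).
        return (r * 1.0 + (1.0 - r) * 4.0) / (n + 1)

    def jbod_per_spindle_load(n):
        # One disk operation per host I/O, spread evenly over n disks.
        return 1.0 / n

    # Setting the two loads equal gives break-even r = 1 - 1/(3n).
    for n in (3, 4, 7):
        breakeven = 1.0 - 1.0 / (3.0 * n)
        print("n=%d: break-even at about %.0f%% reads"
              % (n, 100 * breakeven))

Under these assumptions the break-even against a perfectly balanced JBOD
set falls around 89%-95% reads; since real JBOD loads are unbalanced (the
55%/20% rule of thumb above), the crossover in practice comes somewhat
lower, consistent with the 80%-90% range described above.

------------------------------------------------------------------------------
Keith B. Parris, StorageWorks Host-Based Software Team, Digital Equip. Corp.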