Article 129411 of comp.os.vms:
Path: nntpd.lkg.dec.com!crl.dec.com!crl.dec.com!caen!news.eecs.umich.edu!newsxfer.itd.umich.edu!newsfeed.internetmci.com!howland.reston.ans.net!math.ohio-state.edu!uwm.edu!homer.alpha.net!mvb.saic.com!info-vax
From: ivax@meng.ucl.ac.uk (Mark Iline - Info-VAX account)
Newsgroups: comp.os.vms
Subject: Re: RAID strategies in lieu of Spiralog
Message-ID: <95091513222720@meng.ucl.ac.uk>
Date: Fri, 15 Sep 1995 13:22:27 GMT
Organization: Info-Vax<==>Comp.Os.Vms Gateway
X-Gateway-Source-Info: Mailing List
Lines: 91

> Given that Spiralog will offer "up to 10 times" write performance
> over the existing filesystem, how will RAID strategies be affected?
> To give a for instance to get things rolling, suppose:
>
> - You have the opportunity to RAID your existing AlphaVMS box.
> - You want 2 to 3 times write speedup over your existing drives.
> - You anticipate starting with 8 Gig and growing to at most
>   10 in 5 to 10 years.
> - Will be using Digital controller based RAID.

A quick RAID summary, largely lifted from the "Digital Guide to RAID
Storage Technology".

Raid 0 (striping): provided requests don't generally cross chunk
boundaries, the read and write request rate scales directly with the
number of drives. No cost penalty (in terms of capacity lost), but
reliability is worse than that of a single drive.

Raid 1 (shadowing): good reliability; the read request rate scales
with the number of drives, while the write request rate is <= that of
a single drive.

Raid 5 (data and parity striped across multiple drives): provided
requests don't generally cross chunk boundaries, the read request rate
scales with the number of drives. However, writing generally involves
combining the incoming data with existing parity data, so a write
request may well involve reading the old data and parity off two
drives, then writing the new data and new parity back to the same two
drives. Write performance will suffer.
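The Raid 5 read-modify-write just described can be sketched in a few
lines. This is a toy illustration of the parity arithmetic only (my
own names throughout; it is not code from any Digital controller):

```python
# Sketch of the Raid 5 small-write penalty described above.
# Hypothetical illustration only, not any real controller's code.

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR, the parity operation used by Raid 5."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Update one chunk: read old data and old parity off two drives,
    then write new data and new parity back -- up to 4 operations."""
    # New parity = old parity XOR old data XOR new data, so the
    # other data drives in the stripe need not be read at all.
    new_parity = xor(xor(old_parity, old_data), new_data)
    ops = {"reads": 2, "writes": 2}
    return new_parity, ops

# Check: stripe parity stays consistent after updating one chunk.
d0, d1, d2 = b"\x01", b"\x02", b"\x04"
parity = xor(xor(d0, d1), d2)
new_parity, ops = small_write(d1, parity, b"\xff")
assert new_parity == xor(xor(d0, b"\xff"), d2)
```

The point of the shortcut is that only the two affected drives are
touched, at the cost of the extra reads; a full-stripe write would
need no reads at all.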
Intelligent controllers may partially work around this, as may careful
software control of the write patterns.

If you want a write performance increase (assuming we mean request
rate, ie smallish requests), on the face of it only Raid 0 will help.
However, the chances are that you'll be concerned about storage
reliability, so you'll want to combine it with Raid 1 (0+1). To get
the 3-fold increase you'll need 3 times the drives, going to 6 times
the drives with shadowing. This may well not be what you want...

Spiralog gives a write performance increase because the writes to disk
tend to occur in a 'spiral pattern', which is typically as fast as the
disk can go. This benefits JBOD, Raid 0, 1 and 0+1. However, there's
an extra benefit with Raid 5. Because Spiralog (rather than the
application) has control of what is actually written to the media,
because the writes are generally large (due to the re-writing of
retrieval information along with the data), and because of Spiralog's
ability to group writes when working in a write-behind mode, Spiralog
can ensure that the writes to the Raid-set are an integral number of
stripes in length and occur on stripe boundaries. Hence, Spiralog can
potentially overcome Raid 5's write problem (up to 4 operations for
every write) and still write at close to the spiral write rate. Since
Raid 5 is very good in terms of cost, read request rate and
reliability, this is a big win.

However, what about intelligent, caching Raid 5 controllers? Provided
that you build a suitable level of redundancy and battery backup into
your controller, you can minimise the risk of operating in write-back
mode. On the face of it, controller-based write-back caching doesn't
do much to improve 'write to media' throughput, which is one of
Spiralog's strengths.
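The stripe-aligned write advantage can be made concrete with a toy
I/O-counting model (my own sketch with made-up names; nothing here is
taken from the Spiralog implementation):

```python
# Toy I/O-count model for writes to a Raid 5 set with n data drives
# plus the equivalent of one parity drive. Illustrative numbers only.

def small_write_ios(chunks_written: int) -> dict:
    """Unaligned writes: each chunk costs a read of the old data and
    old parity, plus a write of the new data and new parity."""
    return {"reads": 2 * chunks_written, "writes": 2 * chunks_written}

def full_stripe_ios(n_data_drives: int, stripes: int) -> dict:
    """Stripe-aligned writes: parity is computed from the new data
    already in memory, so no reads -- just n+1 writes per stripe."""
    return {"reads": 0, "writes": (n_data_drives + 1) * stripes}

# Writing one full stripe's worth of data on a 4-data-drive set:
print(small_write_ios(4))     # 8 reads + 8 writes = 16 I/Os
print(full_stripe_ios(4, 1))  # 0 reads + 5 writes =  5 I/Os
```

In this model a filesystem that can always issue whole stripes does
the same logical work in roughly a third of the disk operations, which
is the essence of the Raid 5 win claimed for Spiralog above.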
However, given that applications may well read-modify-rewrite disk
blocks, and there's a fair chance that multiple applications may be
updating the same disk blocks, a number of optimisations can be
applied. If the old data is already in the cache, the write operation
has no need to re-read it to calculate parity. The parity data may
also be in the cache, due to a previous write to the same stripe.
While the write operation still requires 2 writes (one each to two
disks), this goes part way to alleviating the problem.

Secondly, if the same disk block is re-written 10 times, only the last
update actually needs to make it onto the disk. If the controller can
defer its writes, it can make significant savings. Similarly, if it
can 'stack up' writes until it can write a complete stripe
(effectively writing in Raid 3 mode), it can make further savings.
[This might sound unlikely, but it's quite possible when copying
sizeable files around, and is a feature of some DEC controllers.]

This really leaves it down to Raid 5 with an intelligent controller
(ie not host-based), or Raid 5 with Spiralog (either host-based, or a
more 'basic' controller-based implementation). Both have their pros
and cons, but I suspect that Spiralog's backup will clinch it in many
cases.

Mark Iline                                      system@meng.ucl.ac.uk
Dept Mech Eng, University College, London. UK
Read at your own risk.