Article 152524 of comp.os.vms:

In article <4u97f9$ji1@nntpd.lkg.dec.com>, vandenheuvel@eps.enet.dec.com says...
>
>In article , Ferguson@mag.aramark.com (Linwood Ferguson) writes...
>>In this case the file went from a disk with cluster size of 3 to a large raid
>>set with cluster size of 11. Turns out the FDL we used had a bucket size of
>>6, and *apparently* (though I have not seen this written down) CONVERT will
>>build buckets on cluster boundaries, at least sometimes.
>
>Yes, CONVERT and RMS will start out fresh EXTENDs (and AREAs
>are extends) at a fresh cluster boundary.
>If you are extending a lot, then you are also likely to be
>fragmenting a disk, and likely to lose the contiguous-best-
>try attribute if you ever had it. RMS tries to minimize split
>IOs by at least starting out a fresh extend on a cluster
>boundary; if it did not do this, then the bucket that filled
>out the last few blocks in the current extend and then
>had its remainder in the fresh extend would be guaranteed
>to require two (or more) IOs due to the split. In a picture

Interesting. I went back and did some research into how and why we were doing
all this. You are correct, though we did not have too small an allocation in
the FDLs; we had none.

These were some general-purpose routines written about 10 years ago (i.e. I
remembered little about the details, though I wrote them). They were intended
to specify the general characteristics of files that could easily vary in size
by 10x (it was a commercial package), and to do a monthly reorg of the file.
We "discovered" that the simplest way to stay generic was to not specify
allocations and let CONVERT find the size as it built the file. We did that to
a scratch disk, then copied the file back where it needed to go, which tended
to get rid of any extents (or as many as were practical given the size and
state of the disk). This was simpler than trying to automatically edit in
allocations (which we didn't know exactly) and less disruptive to attributes
we wanted to maintain than letting EDIT/FDL do its often somewhat misguided
optimize thing. In fact it worked quite nicely (with the exception of the
cluster-size issue, which has only become relevant recently with huge disks).
It never really clicked that if CONVERT was extending for every bucket (and I
guess it is obvious it would in that case), it was placing each one on a fresh
cluster boundary.

I did some experimenting. Leaving out both allocation and extension gives you
one bucket per cluster (more or less; I didn't check ALL of them). Specifying
an adequate allocation removes the cluster-size impact (well, except at the
tail end), i.e. the file ends up the right size. Specifying an extension
appears to be a good compromise for our case. For example, on that
160,000-block file, a 10K extension got the result to nearly the same size but
still didn't require a hard-coded allocation (a sketch of the FDL settings I
mean is below my signature).

Thanks for solving a minor mystery.

--
Linwood Ferguson              e-mail: ferguson@mag.aramark.com
Mgr. Software Engineering     Voice:  (US) 540/967-0087
ARAMARK Mag & Book Services
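
P.S. A rough sketch of the kind of FDL area section and CONVERT step I mean.
The file names and device are invented for illustration; only the bucket size
of 6 and the 10K extension come from the discussion above, and the exact
numbers would of course depend on the file:

    AREA 0
        ! No ALLOCATION line: leaving it out entirely is what gave us
        ! one bucket per cluster; a generous EXTENSION is the compromise.
        BUCKET_SIZE           6
        EXTENSION             10000
        BEST_TRY_CONTIGUOUS   yes

    $ CONVERT /FDL=GENERIC.FDL MASTER.DAT SCRATCH$DISK:[REORG]MASTER.DAT

The copy back from the scratch disk to where the file belongs is what tended
to squeeze out whatever extents were left.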