From: Hein van den Heuvel [hein_netscape@eps.zko.dec.com]
Sent: Tuesday, November 25, 2003 1:22 PM
To: Info-VAX@Mvb.Saic.Com
Subject: Re: Disk cluster size question

Daryl Jones wrote:
> Drew Shelton wrote in message news:<01L3AQMZ4VB2005RHZ@SEMATECH.Org>...
> > I'm planning an upgrade from VMS 7.1-2 to 7.3-1, and I'd like to take
> > advantage of smaller disk cluster sizes to conserve disk space. Is there
> > any reason (such as performance) not to have the smallest cluster size
> > possible?
> :
> I had a similar discussion on this topic around March 11, 2003 with a
> Mr. Hein Vandenheuvel of HP. The discussion went like this:

Thanks for remembering!

Just for grins.... here is a Perl script to parse DFU output to help
guesstimate the effect of cluster size on wasted space. It also reports
the number of files that would be mapped with a single cluster, making
fragmentation impossible for those.

Note: the script does NOT deal with 'overallocated' files, that is,
files which have more than a single cluster between the registered EOF
and the allocated space.

Sample output:

$ perl CLUSTER_SIZE.P < dfu.tmp

6168 (6168) Files with 6168 headers 2918870/3293460 11.37%

Cluster ____Waste ______ Single_Cluster
------- --------- ------ -------- -----
      1         0  0.00%   1430 23.18%
      2      3470  0.11%   1892 30.67%
      3      7414  0.23%   2257 36.59%
      4     10814  0.33%   2777 45.02%
      5     14975  0.45%   3053 49.50%
      6     18664  0.57%   3290 53.34%
      7     22607  0.69%   3485 56.50%
      8     27258  0.83%   3614 58.59%
      9     31357  0.95%   3726 60.41%
     10     36180  1.10%   3851 62.44%
      :         :      :      :      :
     99    508510 15.44%   5511 89.35%
    100    513330 15.59%   5514 89.40%


# This Perl script processes a DFU search output file to analyze the
# effect of cluster size on wasted allocated space.
# Use a PIPE, or: DEFI/USER SYS$OUTPUT DFU.TMP, then DFU SEARCH DISK00:
#
# Expected input lines look like:
# system$disk00:[directory]filename.ext;vers 1/9
# Primary headers : 6168
# Files found : 6168, Size : 2918870/3293460

while (<>) {
    if (/(\d+)\/(\d+)$/) {              # used/allocated block counts
        $end = $1;                      # blocks in use (EOF)
        $all = $2;                      # blocks allocated
        if (/Files found : (\d+)/) {    # DFU summary line: report totals
            print "\n$1 ($files) Files with $headers headers $end/$all";
            printf(" %5.2f%%\n", ($all - $end) * 100 / $all);
        } else {                        # per-file line: tally files by used size
            $files++;
            $file{$end}++;
        }
    }
    $headers = $1 if (/headers : (\d+)/);
}

#foreach $end (sort {$a <=> $b} keys %file) {
#    printf("%6d %d\n", $file{$end}, $end);
#}

print "\nCluster ____Waste ______ Single_Cluster";
print "\n------- --------- ------ -------- -----\n";

while ($c++ < 100) {                    # try cluster sizes 1 through 100
    $waste = 0;
    $singles = 0;
    foreach $end (keys %file) {
        $count = $file{$end};
        $singles += $count if ($end <= $c);  # file fits in one cluster
        # allocation rounded up to a cluster multiple, minus blocks used
        $w = $c * int(($end + $c - 1) / $c) - $end;
        $waste += $count * $w;
        # print "$c, $count, $end, $w, $waste\n";
    }
    printf("%7d %9d %5.2f%% %6d %5.2f%%\n",
           $c, $waste, $waste * 100 / $all, $singles, 100 * $singles / $files);
}

Comments?

Hein.
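
PS: For the record, the PIPE variant would be something along the lines
of (assuming PERL has been set up as a foreign command on your system):

$ PIPE DFU SEARCH DISK00: | PERL CLUSTER_SIZE.P

And to make the rounding arithmetic explicit, here is the waste formula
pulled out as a tiny stand-alone snippet. The numbers are made up for
illustration, not taken from the DFU run above:

$end = 9;                                    # file uses 9 blocks (EOF)
$c   = 4;                                    # candidate cluster size
# Allocation is EOF rounded up to the next multiple of the cluster size:
$w = $c * int(($end + $c - 1) / $c) - $end;  # 3 clusters = 12 blocks, 3 wasted
print "cluster=$c eof=$end waste=$w\n";      # prints: cluster=4 eof=9 waste=3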