From: Hein van den Heuvel [hein_netscape@eps.zko.dec.com]
Sent: Tuesday, November 25, 2003 1:22 PM
To: Info-VAX@Mvb.Saic.Com
Subject: Re: Disk cluster size question

Daryl Jones wrote:
> Drew Shelton wrote in message news:<01L3AQMZ4VB2005RHZ@SEMATECH.Org>...
> > I'm planning an upgrade from VMS 7.1-2 to 7.3-1, and I'd like to take
> > advantage of smaller disk cluster sizes to conserve disk space. Is there
> > any reason (such as performance) not to have the smallest cluster size
> > possible?
> :
> I had a similar discussion on this topic around March 11, 2003 with a
> Mr. Hein Vandenheuvel of HP. The discussion went like this:

Thanks for remembering!

Just for grins.... here is a Perl script to parse DFU output to help
guesstimate the effect of cluster size on wasted space. It also reports
the number of files that would be mapped with a single cluster, making
fragmentation impossible for those.

Note: the script does NOT deal with 'overallocated' files, that is,
files which have more than a single cluster between the registered EOF
and the allocated space.

Sample output:

$ perl CLUSTER_SIZE.P < dfu.tmp

6168 (6168) Files with 6168 headers 2918870/3293460 11.37%

Cluster ____Waste ______ Single_Cluster
------- --------- ------ -------- -----
      1         0  0.00%   1430 23.18%
      2      3470  0.11%   1892 30.67%
      3      7414  0.23%   2257 36.59%
      4     10814  0.33%   2777 45.02%
      5     14975  0.45%   3053 49.50%
      6     18664  0.57%   3290 53.34%
      7     22607  0.69%   3485 56.50%
      8     27258  0.83%   3614 58.59%
      9     31357  0.95%   3726 60.41%
     10     36180  1.10%   3851 62.44%
      :         :      :      :      :
     99    508510 15.44%   5511 89.35%
    100    513330 15.59%   5514 89.40%


# This Perl script processes a DFU search output file to analyze the
# effect of cluster size on wasted allocated space.
# Use a PIPE, or: DEFI/USER SYS$OUTPUT DFU.TMP, then DFU SEARCH DISK00:
#
# Expected input lines look like:
# system$disk00:[directory]filename.ext;vers 1/9
# Primary headers : 6168
# Files found : 6168, Size : 2918870/3293460

while (<>) {
    if (/(\d+)\/(\d+)$/) {              # used/allocated block counts
        $end = $1;                      # blocks in use (EOF)
        $all = $2;                      # blocks allocated
        if (/Files found : (\d+)/) {    # DFU summary line: report totals
            print "\n$1 ($files) Files with $headers headers $end/$all";
            printf(" %5.2f%%\n", ($all - $end) * 100 / $all);
        } else {                        # per-file line: tally files by used size
            $files++;
            $file{$end}++;
        }
    }
    $headers = $1 if (/headers : (\d+)/);
}

#foreach $end (sort {$a <=> $b} keys %file) {
#    printf("%6d %d\n", $file{$end}, $end);
#}

print "\nCluster ____Waste ______ Single_Cluster";
print "\n------- --------- ------ -------- -----\n";

while ($c++ < 100) {                    # try cluster sizes 1 through 100
    $waste = 0;
    $singles = 0;
    foreach $end (keys %file) {
        $count = $file{$end};
        $singles += $count if ($end <= $c);  # file fits in one cluster
        # allocation rounded up to a cluster multiple, minus blocks used
        $w = $c * int(($end + $c - 1) / $c) - $end;
        $waste += $count * $w;
        # print "$c, $count, $end, $w, $waste\n";
    }
    printf("%7d %9d %5.2f%% %6d %5.2f%%\n",
           $c, $waste, $waste * 100 / $all, $singles, 100 * $singles / $files);
}

Comments?

Hein.
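
PS: For the record, the PIPE variant would be something along the lines
of (assuming PERL has been set up as a foreign command on your system):

$ PIPE DFU SEARCH DISK00: | PERL CLUSTER_SIZE.P

And to make the rounding arithmetic explicit, here is the waste formula
pulled out as a tiny stand-alone snippet. The numbers are made up for
illustration, not taken from the DFU run above:

$end = 9;                                    # file uses 9 blocks (EOF)
$c   = 4;                                    # candidate cluster size
# Allocation is EOF rounded up to the next multiple of the cluster size:
$w = $c * int(($end + $c - 1) / $c) - $end;  # 3 clusters = 12 blocks, 3 wasted
print "cluster=$c eof=$end waste=$w\n";      # prints: cluster=4 eof=9 waste=3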