From: SMTP%"RELAY-INFO-VAX@CRVAX.SRI.COM" 30-AUG-1994 11:37:55.50
To: EVERHART
CC:
Subj: Re: pool fragmentation utility available via FTP

From: jeh@cmkrnl.com (Jamie Hanrahan, Kernel Mode Systems)
X-Newsgroups: comp.os.vms
Subject: Re: pool fragmentation utility available via FTP
Message-ID: <1994Aug30.004545.4474@cmkrnl>
Date: 30 Aug 94 00:45:45 PST
Distribution: world
Organization: Kernel Mode Systems, San Diego, CA
Lines: 93
To: Info-VAX@CRVAX.SRI.COM
X-Gateway-Source-Info: USENET

In article , stus@pssi.com (Stu Sjouwerman) writes:
> In article <1994Aug27.144817.4459@cmkrnl> jeh@cmkrnl.com (Jamie Hanrahan,
> Kernel Mode Systems) writes:
>
>>From: jeh@cmkrnl.com (Jamie Hanrahan, Kernel Mode Systems)
>>Subject: Re: pool fragmentation utility available via FTP
>>Date: 27 Aug 94 14:48:17 PST
>
>>In article , stus@pssi.com (Stu Sjouwerman) writes:
>>> [...]
>>> soon and put them in the same FTP directory. The monitor itself is simply
>>> looking at the amount of fragments in pool, allocation ratio of memory (how
>>> many times per second is it allocated and deallocated), and calculates how
>>> much CPU is spent doing this)
>
>>By the way, Stu, are you aware that the function you're claiming here is
>>flatly impossible to do under pre-6.0 systems?  Provided that you want to
>>include *all* pool allocations, including those that go to the lookaside
>>lists?  (And if you don't, you're hardly giving a complete picture of pool
>>allocation.)
>
> Well, Jamie, it's not so impossible as it may seem to you.

and then he says

> Obviously one does not measure the look-aside list allocations,

That's what *I* said.

(Grin!  There actually IS a way to monitor the lookaside list allocations,
even though various folks do REMQHI's straight off the lists, without doing
nasties like watchpoints, just by intercepting routine calls...  I'll be
curious to see whether Keith or whomever mentions it.)
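For readers who haven't poked at pool internals: a lookaside list serves
fixed-size requests with a single O(1) interlocked dequeue (the REMQHI
mentioned above), and only a miss falls through to the expensive general pool
scan.  Here's a toy model in C of that fast path -- the names (`la_alloc`,
`la_free`, `struct lookaside`) are invented for illustration and are not the
actual VMS routines or data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Toy lookaside list: a singly linked stack of fixed-size blocks,
 * popped in O(1) -- roughly analogous to a REMQHI off the list head.
 * Purely illustrative; not the real VMS structures. */
struct lanode { struct lanode *next; };

struct lookaside {
    struct lanode *head;
    size_t blksize;      /* every block on this list is this size   */
    long hits, misses;   /* only misses fall through to a pool scan */
};

/* O(1) pop; returns NULL on a miss, in which case the caller must
 * do a general pool scan -- the slow path discussed in the article. */
static void *la_alloc(struct lookaside *la, size_t want)
{
    if (want <= la->blksize && la->head) {
        struct lanode *n = la->head;
        la->head = n->next;     /* one pointer swap, no list walk */
        la->hits++;
        return n;
    }
    la->misses++;
    return NULL;
}

/* O(1) push back onto the list head. */
static void la_free(struct lookaside *la, void *blk)
{
    struct lanode *n = blk;
    n->next = la->head;
    la->head = n;
}
```

The hit/miss counters are the sort of thing a monitor could sample cheaply;
the point of contention in the thread is that the real fast path is inline
code, so counting those hits is much harder than counting the misses.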
> What we are concerned with are the _many_ requests that are _not_ serviced
> out of lookaside lists and wind up with a pool scan.  This is by the way a
> known problem, and there is even an entry in DSNlink that explains that it
> _really_ can be a problem in 6.0 systems.  The title is 'OpenVMS Npagedyn
> fragmentation .... LISTPREPOP.DAT'.

I am not disputing that your product can help some systems under some
circumstances.  (To do so would obviously be a silly position for me to take,
as you need only provide *one* counterexample!)  However, I have yet to see a
5.x system with "thousands of holes in the pool" which wasn't helped by a
little twiddling of the various pool parameters, to allow more requests to be
satisfied on the lookaside lists.

The reason I feel it is significant that you *aren't* monitoring these
highly-optimized paths to the lookaside lists is that your monitoring tools
then have no way of telling whether your product would really do any better
than simply making the aforementioned parameter changes.  I'll also grant you
(so that you won't have to post another note to mention it) the advantage of
not requiring a reboot to get some of these benefits.

Since beginning to follow this thread I logged into DECUServe and looked up
the thread on MemoryMaster.  There it is stated that in order to "defragment"
the pool, MM simply scans the pool looking for uselessly small chunks, and
allocates them, taking them out of the free list!  (MM also -- so it says in
the articles on DECUServe -- periodically checks to see if the pieces it has
allocated are next to any freed pieces, i.e. if by freeing the pieces it owns
it could produce larger pieces of free pool, and frees accordingly.)

I can certainly envision situations with thousands of little bits of pool
ahead of a few decently large ones; simply taking the little bits out of the
pool so that the allocator finds the large ones sooner does sound as if it
ought to help.
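To make the claimed effect concrete, here is a toy first-fit free list in C
showing why claiming the tiny fragments shortens the scan.  This is my own
sketch of the *idea* as described in the DECUServe articles, not
MemoryMaster's actual code or VMS's pool structures; `pool_alloc` and
`claim_tiny` are invented names, and the coalesce-and-release step MM is said
to do periodically is omitted for brevity:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of an address-ordered, first-fit pool free list. */
#define MAXBLK 64

struct blk { size_t addr, size; };

struct pool {
    struct blk freelist[MAXBLK];
    int nfree;
    struct blk claimed[MAXBLK];   /* fragments the "defragmenter" holds */
    int nclaimed;
};

/* First-fit: walk the free list until a block >= want is found.
 * Returns the number of blocks examined (the cost of the scan),
 * or -1 on failure; *addr receives the allocated address. */
static int pool_alloc(struct pool *p, size_t want, size_t *addr)
{
    for (int i = 0; i < p->nfree; i++) {
        if (p->freelist[i].size >= want) {
            *addr = p->freelist[i].addr;
            p->freelist[i].addr += want;
            p->freelist[i].size -= want;
            if (p->freelist[i].size == 0) {   /* drop exhausted block */
                for (int j = i; j < p->nfree - 1; j++)
                    p->freelist[j] = p->freelist[j + 1];
                p->nfree--;
            }
            return i + 1;   /* blocks scanned */
        }
    }
    return -1;
}

/* "Defragment" in the style described above: claim every free fragment
 * smaller than `tiny`, removing it from the free list so later
 * allocations no longer scan past it. */
static void claim_tiny(struct pool *p, size_t tiny)
{
    int i = 0;
    while (i < p->nfree) {
        if (p->freelist[i].size < tiny) {
            p->claimed[p->nclaimed++] = p->freelist[i];
            for (int j = i; j < p->nfree - 1; j++)
                p->freelist[j] = p->freelist[j + 1];
            p->nfree--;
        } else {
            i++;
        }
    }
}
```

With three 16-byte holes ahead of one large block, a 1024-byte request costs
a four-block scan; after `claim_tiny` it costs one.  Multiply the hole count
by thousands and the arithmetic behind the claim is obvious -- which is
exactly why it's a cheap trick, and why the remaining question is whether
parameter tuning would have emptied those holes into the lookaside lists in
the first place.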
This method of "defragmenting" is also a fairly cheap operation.  However,
there are at least two things that encourage me to have further doubts.

> By the way, the effects of running the tool can quite easily be measured.
> The Kernel mode and Interrupt stack decrease significantly when non-paged
> pool gets defragmented, practically always resulting in more User mode
> available.

Ah, glad you mentioned this.  One of my "doubt sources" is the EXTREME
difficulty of getting really good before-and-after performance data *when the
monitor utility is running on the system that is being measured*.  I know;
I've done some perf monitor tools.  It isn't at all a straightforward problem.
(And if it's written by people who think that doing a SETIPL to raise IPL
somehow steers your code to execute on the primary CPU, I have *real* doubts
about how it can have any validity whatsoever...)

The other is that I simply don't like the whole notion of products that are
sold with the promise of encouraging the VMS system manager to understand
*less* about the system.  "Here, don't bother to understand the problem, just
throw a little money our way and we'll make it better."  Oh, I realize that
there's a definite market there -- I've taken support calls from so-called
"VMS system managers" who can't even use a text editor, let alone SYSGEN.
But I don't have to like it.  And I don't.

--- Jamie Hanrahan, Kernel Mode Systems, San Diego CA
Internet: jeh@cmkrnl.com (JH645)      Uucp: uunet!cmkrnl!jeh
CompuServe: 74140,2055