From: MERC::"uunet!CRVAX.SRI.COM!RELAY-INFO-VAX" 7-JUL-1992 01:52:13.68
To: info-vax@kl.sri.com
CC:
Subj: Re: new mail notification?

In article <9206060339.AA16450@ucbvax.Berkeley.EDU>,
WILLIAMST@atcf.ncsc.navy.mil ("Tom Williams") writes...
[edited for 80 columns]
>I used to have new mail notification when I logged on, but now it's not
>notifying me any more. I don't know how long this has been happening,
>because I have mail autostart on my VT1300. However, other users have been
>asking me about it, so I'm finally getting around to it. I have the
>MAIL$SYSTEM_FLAGS logical set to 7, and broadcast is enabled for all classes.
>Is there something Im missing?

If this is the problem of which I'm thinking, you may find that *all*
cluster-wide broadcast messages aren't getting anywhere.

The SMISERVER and CLUSTER_SERVER processes are coupled relatively tightly.
If one gets wedged, it can wedge the other, and vice versa. The
CLUSTER_SERVER processes are the ones responsible for distributing messages
around the various systems. If they're wedged, nothing will get through -
they'll just be constipated, so that when you finally break them loose,
people will start getting broadcasts out the wazoo.

Check the SMISERVER and CLUSTER_SERVER processes on all nodes in your
VAXcluster environment. They should all be in the HIB state; if any
(probably most or all) of them are in LEF, this is probably your cause.

To verify that your problem is due to the miscommunication between
SMISERVER and CLUSTER_SERVER, try the following commands:

    $ ANALYZE /SYSTEM
    SDA> SHOW PROCESS /CHANNEL SMISERVER
    SDA> SHOW PROCESS /CHANNEL CLUSTER_SERVER

Each should only have about two channels to mailboxes; the mailbox unit
numbers will probably be adjacent (such as MBA5463: and MBA5464:). If you
see lots of channels assigned to mailboxes, with alternating `Busy' flags
next to them, you've got this problem. Again, for each pair of mailboxes,
the unit number will probably be adjacent.
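If you have more than a couple of nodes to look at, the two checks above can be tallied with something like the following. This is only a rough sketch in Python, not anything VMS gives you: the function name `find_suspects`, the node names, and the tuple layout are all made up for illustration, and the values are assumed to be typed in by hand from what you see in SHOW SYSTEM and SDA. The threshold of two mailbox channels is just the "about two" from the description above, not a documented limit.

```python
# Hypothetical helper: the observations are transcribed by hand from
# SHOW SYSTEM and SDA output; nothing here talks to VMS.

EXPECTED_STATE = "HIB"            # healthy servers should be hibernating
MAX_NORMAL_MAILBOX_CHANNELS = 2   # "about two" mailbox channels; rough guess


def find_suspects(observations):
    """Return (node, process, reasons) for each process that looks wedged.

    observations: iterable of (node, process, state, mailbox_channels).
    """
    suspects = []
    for node, process, state, mailbox_channels in observations:
        reasons = []
        if state.upper() != EXPECTED_STATE:
            reasons.append(f"state is {state.upper()}, expected {EXPECTED_STATE}")
        if mailbox_channels > MAX_NORMAL_MAILBOX_CHANNELS:
            reasons.append(f"{mailbox_channels} mailbox channels")
        if reasons:
            suspects.append((node, process, reasons))
    return suspects


if __name__ == "__main__":
    # Made-up sample data for two hypothetical nodes.
    sample = [
        ("NODEA", "SMISERVER", "HIB", 2),
        ("NODEA", "CLUSTER_SERVER", "LEF", 14),
        ("NODEB", "SMISERVER", "LEF", 2),
    ]
    for node, process, reasons in find_suspects(sample):
        print(f"{node} {process}: " + "; ".join(reasons))
```

Anything it reports is only a candidate; the SDA channel listing is still what tells you whether the mailboxes really are backed up.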
Now, to clear this up, go back to DCL and, for each pair of mailboxes, try
copying one to the other:

    $ COPY MBA5463: MBA5464:

Repeat the process until COPY hangs, then CTRL/Y out of it, and try doing
it the other way (MBA5464: to MBA5463:). (This is because I forget which
way the hangup usually goes - although I *think* it's low unit to high
unit, your mileage may vary.) Proceed to the next pair of mailboxes. You
may or may not have to do this across all systems in the VAXcluster
environment. Somewhere along the way, all those blocked broadcast messages
will come spilling forth, and you'll be clean - until the next time it
happens.

Note that, if CLUSTER_SERVER hangs, various other things may hang as well,
such as cluster-wide disk mount operations - and OPCOM messages. When OPCOM
clogs, AUDIT_SERVER may not be far behind, and when AUDIT_SERVER hangs, any
event that causes an audit message to be sent will hang the process
involved. This can be a pretty insidious problem.

The Digital CSC has a patch for this (CSCPAT_0274).

>Thanks in advance

Hope this helps..

>Tom Williams
>williamst@atcf.ncsc.navy.mil

#ken :-)}

Jeratol the Chaotic
Coar@Nephi.Enet.DEC.Com | All opinions herein contained, stated or implied,
Coar@DECUS.Org          | are solely those of the author. And he's fullovem.
Coar@Eisner.DECUS.Org   | `... it was mine art, ... that made gape the pine
Massachusetts, USA      | and let thee out.' - Prospero (_The Tempest_)