From:	SMTP%"lauri@elwing.fnal.gov" 25-AUG-1994 11:38:13.98
To:	EVERHART
CC:	
Subj:	Re: sysman problem

From: lauri@fndcd.fnal.gov (Laurelin of Middle Earth)
X-Newsgroups: comp.os.vms
Subject: Re: sysman problem
Date: 24 Aug 1994 14:46:21 GMT
Organization: Fermi National Accelerator Lab
Lines: 33
Message-Id: <33fmft$kks@fnnews.fnal.gov>
Reply-To: lauri@elwing.fnal.gov
Nntp-Posting-Host: dcd00.fnal.gov
To: Info-VAX@CRVAX.SRI.COM
X-Gateway-Source-Info: USENET

In article , martineau@automatismes.ccr.hydro.qc.ca (alain martineau) writes:
> On a cluster with 20 nodes, there is an intermittent problem. One of the
> nodes, not always the same, becomes unreachable from all others, but it
> sees the other members OK. There is no error message on any node, no
> explanation is given. Any pointer or hint ? I would gladly RTFM if I
> knew where to start.
> thank you
> alain martineau
> amartineau@nccr.ccr.hydro.qc.ca

If the problem is truly SYSMAN, I've seen it many times.  Don't know the
reason, distant legend reports that it involves filled mailboxes and the
like.  To fix, you can STOP SMISERVER (from the system account, or use
STOP/ID on that process) on the node having the problem, then SET HOST to
it (or telnet, or whatever), then restart the SMISERVER process (from the
SYSTEM account):

	$ @sys$system:startup smiserver

Various fixes posted here and elsewhere by other people have included
commands for emptying the SMISERVER's mailboxes through translation of
logical names ($ SHOW LOG SMI$* gives you the names) and TYPEing of those
mailboxes followed by CTRL/C.  I've never needed to resort to those
extremes, just restarting the server was enough.

Your mileage may vary.

-- lauri

/-----------------------------------------------------------------------------\
| Lauri Loebel Carpenter             "All that is gold does not glitter,      |
| lauri@elwing.fnal.gov               Not all those who wander are lost..."   |
|                                                            - JRRT           |
| #include /* I only speak for myself */                                      |
\-----------------------------------------------------------------------------/
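
For reference, the restart described in the post can be collected into one short DCL sequence. This is only a sketch, assuming you are already logged in to the affected node under the SYSTEM account; it uses nothing beyond the commands quoted in the post, and the SHOW LOGICAL step is optional and only lists the mailbox logical names.

```
$! Sketch only: assumes an interactive session on the affected node,
$! under the SYSTEM account.
$!
$ SHOW LOGICAL SMI$*              ! optional: list the SMISERVER mailbox logical names
$ STOP SMISERVER                  ! stop the server by process name (or use STOP/ID on that process)
$ @SYS$SYSTEM:STARTUP SMISERVER   ! restart the SMISERVER process
```

If the node cannot be reached through SYSMAN at all, SET HOST or telnet to it first, as the post suggests, and then enter the commands there.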