Article 8079 of comp.org.decus:
The DECUServe Journal
---------------------
June, 1997
From the Editors' Keyboard
    What's inside
Controlling CPU Consumption
    How to handle "CPU hog" applications
Problem Copying Backup Saveset
    Illegal record on tape-to-disk COPY
Web Page Expiration
    Specifying an expiration time for a link
NT 4.0 SP3 and PATHWORKS
    NT fix breaks PATHWORKS authentication
The Deadly DELNI
    Pros and (mostly) cons of DELNIs and DEMPRs
Hooking up to ISDN PRI
    What do you need between a router and the PRI?
Laser Printer Error Code
    Meaning of a DEClaser 2150 status code
Alpha SMP Performance
    Observations on multiprocessor systems' performance
RAID System Disk?
    Is it safe to use a RAID set as system disk?
OSF/1 Disk Woes
    Get "camlogger" errors after swapping disk
About the DECUServe Journal
Contact Information
The DECUServe Journal June, 1997 Page 2
From the Editors' Keyboard
---- --- -------- --------
It's June (well, it still is on this coast), and time for another
serving of DECUServe's best. On the menu this time are such
traditional delicacies as OpenVMS tuning arcana and a disagreement
(civil, of course) over the virtues of certain old pieces of DEC
networking hardware. In a slightly more contemporary vein, we also
have a bit of Web hacking, and an item from the MS_WINDOWS_OS
conference, which emerged from the old WINDOWS_NT conference with
its scope broadened to include other Windows platforms. The "meal"
is rounded off with a generous helping of hardware hints and a side
of Unix; just the thing for a summer picnic! Now, how to run a
network hookup all the way out to the beach...?
* * * * *
Controlling CPU Consumption
----------- --- -----------
Abstract:
A discussion of how tuning parameters will (or won't) help control
the impact of CPU-intensive applications on a system.
Participants:
Gus Altobello, John Briggs, David Campen, Dan Esbensen, Linwood
Ferguson, Jack Harvey, Alan Hunt, Mark Lasoff, Bart Lederman, Brian
Tillman.
Conference: VMS
Note 2825.0, 24-Apr-1997
Harvey: Controlling CPU Utilization
-----------------------------------
We have a growing CPU intensive application that is raising some design
issues. The application currently lives on a dozen or so "compute
engines" handling about 120 or so users total. Roughly ten per engine.
Disk I/O is negligible. The current engines are VAXen of about 32 VUPs
each. The users are using X-windows and the engines are generating the
displays.
This is growing, and we aren't sure how best to grow it. The
application is designed to run on any number of compute engines. The
problem is that one of the ten users per engine can issue commands that
will suddenly use much of the engine CPU power, blocking other users.
One approach might be to reduce the total number of engines by
replacing them with a few (2-3) Alphas of about 400 VUPs each. The
idea is that a CPU devourer request from one user won't affect other
users on the same engine as much, or at least for as long.
Another approach is to use 120 much smaller VAXen (or maybe PCs),
assigning only one user per engine. This would prevent one user from
stealing CPU power from other users. The user who issues the CPU
devouring request will get lousy response, but other users on other
engines will still be getting good response.
Both approaches are expensive. Is there any way we can use VMS to
bound the amount of CPU that a single user can gobble? To satisfy
those who favor one small engine per user, can we limit CPU use by a
single process to, say, ten percent of the total available?
I'm interested not only in ideas on how to do this, but in thoughts on
the choices between the two different approaches described above.
Note 2825.1, 24-Apr-1997
Campen:
--------
> application is designed to run on any number of compute engines. The
> problem is that one of the ten users per engine can issue commands that
> will suddenly use much of the engine CPU power, blocking other users.
Blocking for how long? If all the processes are running at the same
base priority and this base priority is <= 16 then I don't understand
why normal VMS scheduling is not giving all users equal access to the
CPU; if you have ten users each running a compute intensive process
then each should get 10% of the CPU.
Note 2825.2, 24-Apr-1997
Altobello: Some thoughts from a fading brain...
-----------------------------------------------
I was wondering the same as David.
What I presume is happening is you have several folks doing little
graphs or something, and one guy kicks off his big compute load. The
system gets sluggish then (I've seen this) and the others get
much-reduced responsiveness.
I'm losing my touch with VMS, but wasn't there a way to tune the amount
of CPU boost one received for I/O? If you allow I/O to give a greater
boost than normal, perhaps you can make up for the CPU hog.
I've also, in the past, played with the quantum and such, making it so
that the CPU is released more often.
What you appear to be asking is "how do I throttle CPU-bound
processes", and one way to do that is limit their CPU access and the
other is to make I/O-bound processes get higher priorities faster.
Finally, if your application knows it's going into a CPU-intensive
mode, you could possibly have it drop its priority at that time. If
you control the apps on the box, this may make things seem "fairer"
without a lot of effort.
Just some (fading/obsolete) thoughts,
-gus
Note 2825.3, 24-Apr-1997
Campen:
--------
If the case is as Gus describes and quantum is set at its default value
of 20 (=200 ms) then reducing quantum to perhaps 5 should make the
system seem less sticky. Reducing quantum will mean a little more CPU
time is lost to overhead but the default of 20 was chosen a long time
ago when most machines were 1 VUP.
> I'm losing my touch with VMS, but wasn't there a way to tune the amount
> of CPU boost one received for I/O? If you allow I/O to give a greater
> boost than normal, perhaps you can make up for the CPU hog.
I don't know of any way to vary the amount of priority boost a process
receives for I/O completion. Non-terminal I/O completion gives the
process a priority boost of 2, terminal output completion a boost of 4
and terminal input completion a boost of 6. A non-zero value of
priority_offset would negatively affect the benefits of a priority
boost so Jack might want to check that this is set to its default value
of zero.
Note 2825.4, 25-Apr-1997
Harvey: Maybe I don't understand what to expect
-----------------------------------------------
Having the application lower its priority when it starts a big CPU
gobble might be possible in the long term. It would require a contract
negotiation, however. :-(
The SYSGEN value of PRIORITY_OFFSET is zero, so that's not a problem.
However, David raises a point. Why, under normal VMS scheduling, does
the hog process block other users? Yet, as Gus has seen, the blocking
effect is real.
Note 2825.5, 25-Apr-1997
Lederman: I'm going to suggest the manuals.
-------------------------------------------
There probably are ways to adjust the system, but I would not go in and
adjust QUANTUM without a good understanding of what will happen.
May I suggest that you look at the "OpenVMS Performance Management"
manual, particularly Chapters 9 and 13, and the decision trees in
Appendix A? I think you will find useful information there. There is
also a feature which appears to be little used in VMS called Class
Scheduling that allows the system to react differently to some classes
of programs than to others. There should be a sample program in
SYS$EXAMPLES:CLASS.C that says something about it.
Note 2825.6, 25-Apr-1997
Briggs: Dynamic priority reduction
----------------------------------
I have two suggestions.
1. Enable the class scheduler. Put each process in its own class
and give the class a maximum of 50% CPU utilization.
Unfortunately, from what I remember of the class scheduler implementation,
this will be bursty -- your gobbler will get lots of CPU for the
first half of each sample interval and then get chopped off for the
last half.
2. There is a piece of freeware that we found at one point. It monitors
CPU utilization and imposes a dynamic priority reduction on CPU-intensive
processes. It's been a while and I have no idea where to find it.
But it's not a tough wheel to re-invent.
This actually worked pretty well and kept the system responsive without
killing gobbler processes utterly.
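A minimal sketch of the "dynamic priority reduction" idea in suggestion 2,
written in Python for illustration only. It is not the freeware Briggs
mentions; the 50% threshold, the one-step adjustment per sampling interval,
and the names Proc and adjust are all assumptions. On VMS the resulting
priority would still have to be applied to the process, for example with
SET PROCESS/PRIORITY.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    base_priority: int   # priority the process normally runs at
    priority: int        # priority currently imposed by the monitor


def adjust(proc: Proc, cpu_fraction: float,
           threshold: float = 0.5, floor: int = 0) -> int:
    """Return the new priority after one sampling interval.

    A process that used more than `threshold` of the CPU during the
    interval drops one priority level (never below `floor`); otherwise
    it climbs one level back toward its base priority.
    """
    if cpu_fraction > threshold:
        proc.priority = max(floor, proc.priority - 1)
    else:
        proc.priority = min(proc.base_priority, proc.priority + 1)
    return proc.priority


# Hypothetical samples: three busy intervals followed by two quiet ones.
hog = Proc(pid=1, base_priority=4, priority=4)
history = [adjust(hog, fraction) for fraction in (0.9, 0.9, 0.9, 0.1, 0.1)]
print(history)
```

In this run the process is pushed down one level per busy interval and
climbs back once it goes quiet, so a gobbler is slowed rather than stopped.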
Note 2825.7, 25-Apr-1997
Tillman: ForWords shows...
--------------------------
3RD_PARTY_VMS_SOFTWARE topic 10 contains some discussion on this. In
particular, 10.24 speaks about lowering quantum and the effect it can
have on overhead.
Note 2825.8, 25-Apr-1997
Lasoff: Use DCL
---------------
Hi Jack. A quick solution would be to write a DCL procedure (or
program) that wakes up every so often, checks all processes on the
system, and if it notices ANY user running program XXX, then it does a
$SET PROCESS/PRIORITY=0/ID=pid.
You may need to consider raising that pid's priority at a later time if
the user stays logged-in after running the CPU-intensive application.
This is not elegant, but it will prevent this process from taking over
the system even with the CPU priority boosts of VMS.
Note 2825.9, 25-Apr-1997
Ferguson: Quantum -- try it
---------------------------
> However, David raises a point. Why, under normal VMS scheduling, does
> the hog process block other users? Yet, as Gus has seen, the blocking
> effect is real.
It's real, but I'm not sure why you see it as unexpected. A lot
depends on what the other users are doing. If they are doing lots of
things that give up the CPU (e.g. keystrokes) you tend to get something
like 200ms for the hog, 1ms for A (because it lets go), 1ms for B
(same), 200ms for the hog. Each time through you wait for the hog
before you give the process its 1ms, but then it lets go and doesn't
consume its share of quantum. The hog always does.
200ms is a long, long time to hold a CPU if a lot of people are
waiting. Get a couple processes per CPU that will hold it the whole
200ms and interactive response gets dirt slow. Someone mentioned older
VAXen, but also bear in mind batch work used to be more important than
crisp interactive response (or at least more than now -- people are
spoiled and expect instant gratification -- I remember when a
785 seemed lightning fast).
Quantum is relatively harmless, drop it way on down and see what you
think. Stick it up high to get a real feel for the range of effects.
You can change it dynamically from moment to moment.
If you do start changing priorities of the COM processes, one other
thing you might look at if you find long compute queues is to adjust
PIXSCAN. Its default of 10 looks through only a few COM or COMO
processes for ones to boost. Having this too low relative to the number of
total processes can let one high priority COM process never get
pre-empted. It's kind of like the I/O boost of priority on completion,
to give others at least a chance even if their base priority is lower.
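Ferguson's "200ms for the hog, 1ms for A, 1ms for B" picture can be put
into numbers. The Python sketch below is a deliberately simplified model of
that description only: it ignores priority boosts, pre-emption, and
scheduling overhead, and the function name and parameters are hypothetical.

```python
def cycle(quantum_ms, hogs, interactive, burst_ms=1):
    """Model one round-robin pass over the compute queue.

    Each hog burns a full quantum; each interactive process runs for
    burst_ms and then gives up the CPU voluntarily.
    Returns (pass length in ms, hogs' share of the pass, worst-case wait
    in ms for an interactive process).
    """
    length = hogs * quantum_ms + interactive * burst_ms
    hog_share = hogs * quantum_ms / length
    # Worst case, an interactive process waits for everyone else in the pass.
    worst_wait = length - burst_ms
    return length, hog_share, worst_wait


# QUANTUM is in 10 ms units, so 20 -> 200 ms and 5 -> 50 ms.
for quantum_ms in (200, 50):
    length, share, wait = cycle(quantum_ms, hogs=1, interactive=2)
    print(f"quantum {quantum_ms:3d} ms: pass {length} ms, "
          f"hog share {share:.1%}, worst wait {wait} ms")
```

With one hog and two interactive processes, the pass shrinks from 202 ms at
the default QUANTUM of 20 to 52 ms at a QUANTUM of 5, the value Campen
suggests trying in 2825.3. The hog still gets nearly all of the CPU in both
cases; what changes is how long the others wait for their turn.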
Note 2825.10, 26-Apr-1997
Hunt: Look at CLASS Scheduling
------------------------------
I would agree with the earlier advice on looking up class scheduling.
This is somewhat new so you won't hear much about it. I believe it may
do almost exactly what you are asking for by ensuring everyone gets a
fair share of the system. You can tailor it and there are hooks for
you to set up a custom schedule. Digital may have some other packages
to help with this as well.
Note 2825.11, 27-Apr-1997
Esbensen: Dynamic Load Balancer is built to do just what you are asking...
--------------------------------------------------------------------------
Hello,
Our Dynamic Load Balancer is specifically designed to make sure that
all users get a "fair shake" at the CPU...and to even out "bursty"
response times.
More information can be seen at:
http://www.ttinet.com/
Dan Esbensen
Designer of Dynamic Load Balancer...
Problem Copying Backup Saveset
------- ------- ------ -------
Abstract:
For some of you, this next short item may fall under the category of
"I knew that" -- but sometimes a little reminder about those obscure
little gotchas is in order. So how many potential problems can you
think of when using the COPY command to move a BACKUP saveset from
tape to disk?
Participants:
Arnold De Larisch, Linwood Ferguson, Terry Kennedy, Larry Kilgallen.
Conference: VMS
Note 2821.0, 13-Apr-1997
Ferguson: Copy backup saveset to disk, illegal record?
------------------------------------------------------
I thought it was always possible to copy VMS Backup savesets from tape
to disk by just mounting the tape and using COPY (+/- those continued
on second tapes).
I've tried it a couple times lately on a new TZ887 drive (not sure if
that's related) and I get the errors below. I've done this with multiple tapes
(though similar in format). In all cases the tape reads fine if I do a
BACKUP command to read it. I've tried it with and without the /BLOCK.
What am I missing?
PS. These were created with
backup/image device magtape:arc#/save /block=65534
$ mount/over=id magtape/block=65534
%MOUNT-I-MOUNTED, ARC mounted on _$1$MUA0: (HSJ003)
$ reca cop
$ copy magtape:[000000]*. sys$scratch:/log
%COPY-E-OPENIN, error opening $1$MUA0:[]ARC3.;1 as input
-RMS-F-IRC, illegal record encountered; VBN or record number = 0
%COPY-E-OPENIN, error opening $1$MUA0:[]ARC4.;1 as input
-RMS-F-IRC, illegal record encountered; VBN or record number = 0
%COPY-E-OPENIN, error opening $1$MUA0:[]ARC5.;1 as input
-RMS-F-IRC, illegal record encountered; VBN or record number = 0
%COPY-E-OPENIN, error opening $1$MUA0:[]ARC6.;1 as input
-RMS-F-IRC, illegal record encountered; VBN or record number = 0
Note 2821.1, 13-Apr-1997
Kennedy: Recordsize problem
---------------------------
Isn't the maximum RMS recordsize for disk records 32K?
Note 2821.2, 13-Apr-1997
Ferguson: That sounds reasonable
--------------------------------
Now that could be. I just kept staring at the "OPENIN" part.
If I can ever get this whole tape restored I'll experiment (it's got a
ba-zillion little files, and takes forever to restore, but isn't all
that large, so I was trying to get each saveset to disk and restore
them separately).
Note 2821.3, 14-Apr-1997
De Larisch: VMS Copy has a record limit of 32256
------------------------------------------------
> I thought it was always possible to copy VMS Backup savesets from tape
> to disk by just mounting the tape and using COPY (+/- those continued
> on second tapes).
No it's not ... the LARGEST block size that VMS Copy can handle is 32256.
You may want to use SAVESET manager to rewrite the tapes with a smaller block
size.
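The arithmetic behind the two replies, as a small Python sketch. The limits
are simply the figures quoted in this thread (Kennedy's "32K" read as 32768,
and De Larisch's 32256); neither has been checked against the RMS
documentation here, and the function name is hypothetical.

```python
# Limits as quoted in the thread; treat both as unverified.
LIMITS = {"Kennedy (32K)": 32 * 1024, "De Larisch": 32256}

# /BLOCK value used when the savesets were written (from note 2821.0).
SAVESET_BLOCK = 65534


def fits(block_size: int, limit: int) -> bool:
    """True if a tape block of block_size bytes is within the limit."""
    return block_size <= limit


for who, limit in LIMITS.items():
    print(f"{who}: limit {limit}, "
          f"saveset block fits: {fits(SAVESET_BLOCK, limit)}")
```

Either way the 65534-byte blocks are roughly twice the quoted limit, which
is consistent with COPY rejecting every saveset on the tape.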
Note 2821.4, 14-Apr-1997
Kilgallen: Another limitation on COPY
-------------------------------------
Although not related to the problem you encountered, the COPY method
also requires that there be no retries in the writing of blocks on
the tape. The only way I know to guarantee that is to make use of
the /INTERCHANGE qualifier (since it was created to facilitate SDC
duplication), but that has the (potentially) undesired side effect
of preventing the transfer of ACL information.
Web Page Expiration
--- ---- ----------
Abstract:
Recently, DECUServe has been the site of a fair amount of
Web-related activity, though some of you may not have noticed. A
small group of dedicated volunteers have been working on a project
to make some of DECUServe's rich content available on the Web. This
next exchange concerns one of the "details of implementation" that
inevitably appear in such undertakings.
Participants:
Charlie Byrne, Bob Hassinger, Lynda Peach, Don Vickers.
Conference: WWW
Note 119.0, 10-Mar-1997
Vickers: HTML: Web page expiration options
------------------------------------------
Is there a way to specify an expiration time for a page? I seem to
recall a META tag for EXPIRE but cannot find it.
My goal is to implement some way to help clients of the DECUServe Web
pages to get the latest copies of the pages. The various pages are
generated on an irregular basis. The /conferences/index.html top level
page is generated every few hours and each conference's pages are
generated when a change is found in that conference. In some cases
this is in a few hours and in other cases virtually never.
One approach would be to set some expected 'expiration' time on each
page as it is generated. Another might be to force the pages from the
client and server caches somehow.
We do not wish to use cookies. Some of the clients have made it clear
that cookies are not acceptable and some clients use browsers without
cookie support.
Thanks for any ideas and suggestions,
don
Note 119.1, 10-Mar-1997
Byrne: HTTP for expire, maybe not from HTML
-------------------------------------------
This is from Lincoln Stein's page,
http://www-genome.wi.mit.edu/ftp/pub/software/www/
It may refresh your memory:
Creating the HTTP Header
Creating the Standard Header for a Virtual Document
print $query->header('image/gif');
This prints out the required HTTP Content-type: header and the
requisite blank line beneath it. If no parameter is specified, it
will default to 'text/html'.
An extended form of this method allows you to specify a status code and
a message to pass back to the browser:
print $query->header(-type=>'image/gif',
                     -status=>'204 No Response');
This presents the browser with a status code of 204 (No response).
Properly-behaved browsers will take no action, simply
remaining on the current page. (This is appropriate for a script that
does some processing but doesn't need to display any
results, or for a script called when a user clicks on an empty part of
a clickable image map.)
Several other named parameters are recognized. Here's a contrived
example that uses them all:
print $query->header(-type=>'image/gif',
                     -status=>'402 Payment Required',
                     -expires=>'+3d',
                     -cookie=>$my_cookie,
                     -Cost=>'$0.02');
-expires
Some browsers, such as Internet Explorer, cache the output of CGI
scripts. Others, such as Netscape Navigator do not. This
leads to annoying and inconsistent behavior when going from one browser
to another. You can force the behavior to be
consistent by using the -expires parameter. When you specify an
absolute or relative expiration interval with this parameter,
browsers and proxy servers will cache the script's output until the
indicated expiration date. The following forms are all valid for
the -expires field:
    +30s                              30 seconds from now
    +10m                              ten minutes from now
    +1h                               one hour from now
    -1d                               yesterday (i.e. "ASAP!")
    now                               immediately
    +3M                               in three months
    +10y                              in ten years time
    Thursday, 25-Apr-96 00:40:33 GMT  at the indicated time & date
When you use -expires, the script also generates a correct time stamp
for the generated document to ensure that your clock
and the browser's clock agree. This allows you to create documents that
are reliably cached for short periods of time.
CGI::expires() is the static function call used internally that turns
relative time intervals into HTTP dates. You can call it
directly if you wish.
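To make the relative forms in the quoted table concrete, here is a rough
Python sketch of the same kind of conversion. It is not CGI.pm's code: the
function name, the choice of 30 days per month and 365 days per year, and
the RFC 1123 style output produced by Python's email.utils.formatdate are
all assumptions of this sketch.

```python
import re
from email.utils import formatdate

# Seconds per unit. Month and year lengths are approximations assumed here.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400,
          "M": 30 * 86400, "y": 365 * 86400}


def expires(spec: str, now: float) -> str:
    """Turn '+30s', '-1d', 'now', ... into an HTTP date string in GMT.

    `now` is the current time in seconds since the epoch. Anything that
    is not a recognized relative form is assumed to be an absolute date
    already and is returned unchanged.
    """
    if spec == "now":
        offset = 0
    else:
        match = re.fullmatch(r"([+-]\d+)([smhdMy])", spec)
        if match is None:
            return spec
        offset = int(match.group(1)) * _UNITS[match.group(2)]
    return formatdate(now + offset, usegmt=True)


# Fixed reference time so the output is repeatable (an arbitrary choice).
REFERENCE = 1_000_000_000
for spec in ("now", "+1h", "+3d", "-1d"):
    print(f"{spec:>4}  ->  Expires: {expires(spec, REFERENCE)}")
```

In a real script the reference time would be the current time (for example
time.time()), and the resulting string would go out in an Expires: header.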
Note 119.2, 10-Mar-1997
Peach: Date must be GMT.
------------------------
In addition to Charlie's note ....
The Expires header is used by the proxy server as a mechanism to keep
caches up-to-date. Example:
The proxy server discards the document at the indicated time. There is
one important note -- the exact format of the date is specified by the
standard, and the date MUST ALWAYS be in Greenwich Mean Time (GMT).
The above is from Webmaster Expert Solutions.
Will be very interested to know how this works, Don.
Lynda
Note 119.3, 10-Mar-1997
Hassinger:
-----------
I had intended to answer that question a few days ago - guess I did
not.
Yes, you can put an expires in the headers. Take care about the
particular CGI scripting support you are using: the code shown might
not quite work on VMS executing DCL, for example. Every platform (and
server) provides different CGI support, although a good many provide
pretty similar support.
>
And this points out the potential use of a META tag to effectively
embed the header information within the HTML code. The question is
how well you can depend on the META http-equiv working. My
impression is that it may not be fully supported by everything. I am
not sure what its standardization status is.
NT 4.0 SP3 and PATHWORKS
-- --- --- --- ---------
Abstract:
A report of a problem with PATHWORKS 4 SMB authentication after
applying an NT service pack, how to fix it, and how to recover when
the published fix doesn't work.
Participants:
Kevin Angley, Paul Flaherty Jr.
Conference: MS_WINDOWS_OS
Note 285.0, 16-May-1997
Angley: NT4.0 SP3 breaks Pathworks 4 SMB authentication
-------------------------------------------------------
NT 4.0 SP3 having now been released (
ftp://ftp.microsoft.com/bussys/winnt/winnt-public/fixes/usa/nt40/ussp3),
I installed it on my NT 4.0 workstation. Now, I am unable to connect to
Pathworks 4 file services using the \\node\share%user construct. The
error says that I am unable to login from this workstation.
The release notes indicate changes in SMB authentication and reference
two knowledge base articles. Both article numbers they give (Q161372
and Q166730) are wrong.
Anyone know what registry hack is necessary to fix this problem?
Note 285.1, 20-May-1997
Angley: Registry entry doesn't fix it for me
--------------------------------------------
The Knowledge Base article finally became available. It says to hack
the registry as follows:
1. Run Registry Editor (Regedt32.exe).
2. From the HKEY_LOCAL_MACHINE subtree, go to the following key:
   \system\currentcontrolset\services\rdr\parameters
3. Click Add Value on the Edit menu.
4. Add the following:
   Value Name: EnablePlainTextPassword
   Data Type:  REG_DWORD
   Data:       1
5. Click OK and then quit Registry Editor.
6. Shut down and restart Windows NT.
I did so, and rebooted. It did not correct the problem. DEC DSN says
that this should have corrected the problem. Anyone else seeing this?
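For anyone who has to make the same change on several workstations, the
value can also be written down as a registry script rather than typed in by
hand. The Python sketch below only builds the text of such a script in the
REGEDIT4 format; it is an illustration, not part of the Knowledge Base
article, and the helper name reg_dword is made up.

```python
def reg_dword(key: str, name: str, value: int) -> str:
    """Build REGEDIT4 script text that sets one REG_DWORD value."""
    lines = [
        "REGEDIT4",
        "",
        f"[{key}]",
        f'"{name}"=dword:{value:08x}',   # DWORD data is eight hex digits
        "",
    ]
    return "\n".join(lines)


# Key and value exactly as given in the steps quoted above.
KEY = r"HKEY_LOCAL_MACHINE\system\currentcontrolset\services\rdr\parameters"
script = reg_dword(KEY, "EnablePlainTextPassword", 1)
print(script)
```

As with the manual steps, the change would take effect only after Windows
NT is shut down and restarted.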
Note 285.2, 22-May-1997
Flaherty: Same problem with SAMBA
---------------------------------
I had the same problem connecting to SAMBA for VMS, but the above hack
fixed it.
Note 285.3, 23-May-1997
Angley: Works for most
----------------------
Apparently this fix works for most people. On a particular workstation
of mine, it did not fix it. I was, however, able to fix it by
recovering the rdr.sys file from the uninstall directory.
The Deadly DELNI
--- ------ -----
Abstract:
We now date ourselves when we admit to having experience with
certain kinds of networking hardware -- and DELNIs and DEMPRs are
surely on that list. The following discussion concerns a plan for
replacing some "vintage" equipment, and even (perhaps surprisingly)
flushes out some defenders of the old boxes.
Participants:
Gus Altobello, Linwood Ferguson, Jack Harvey, Ken Johnson, Terry
Kennedy, Milton Lopez, Norm Raphael, Brian Tillman.
Conference: DEC_NETWORKING
Note 1290.0, 8-May-1997
Lopez: Die, DELNI, die!
-----------------------
Will the following configuration work (DS200 = DECserver 200)?
               |
DS200----T-----|    T = AUI - BNC transceiver
DS200----T-----|
DS200----T-----|
               |
DEMPR----T-----|    <- Thinwire backbone
               |
uVAX-----------|
               |
Cat5 hub-------|    This will be added as part
               |    of a migration to Cat5
Just for background I still have the DS200's hanging off of a DELNI
(which I would like to get rid of) like this:
      H4000           H4000
--------O---------------O------  Thickwire backbone
        |               |
        |               |
        |               |
      uVAX            DELNI
                      | | |----DS200   (Some terminals
                      | | |----DS200    still in use)
                      | |------DS200
                      |   .
                      |   .
                    DEMPR
                      | | |----
                      | | |----  Thinwire   (lots of
                      | |------  segments    PCs)
                      |
Note 1290.1, 8-May-1997
Johnson: Looks good
-------------------
The proposed new configuration looks much better than the old. Our
configuration is similar to your new configuration, except that our multiport
thinwire repeaters are from Allied Telesis, not Digital.
Note 1290.2, 9-May-1997
Harvey: Go for it
-----------------
The new configuration looks good, but I don't see it as "much better"
than the old, which looks good to me, too. What's wrong with it, Ken?
Note 1290.3, 9-May-1997
Kennedy:
---------
You can't hang repeaters/hubs off DELNI's. Repeat after me: "DELNI's are
an abomination". They're "multiport AUI fanout devices", a category of
device that isn't mentioned in the 802.3 specs.
Note 1290.4, 9-May-1997
Johnson: Our experience
-----------------------
We used to have cascaded DELNIs, with a multiport repeater hanging off of the
top-level DELNI. When we switched to a configuration with the MPR as the
backbone, our Ethernet network was more reliable. Later, when we needed more
Thinwire legs, we went to a Thinwire backbone, with MPRs off of that, similar
to the proposed new configuration.
Note 1290.5, 11-May-1997
Harvey: Bean counting engineering
---------------------------------
> You can't hang repeaters/hubs off DELNI's.
Nonsense. I've done it for years. So what if they aren't mentioned in
the specs? Why does the spec make the electrons dance nicer?
Note 1290.6, 11-May-1997
Kennedy:
---------
There are very few specific circumstances (one?) where you can
do it. It involves an EtherCork (DEC test adapter) in the global port, the
DELNI in global mode (but not connected to anything upstream) and a single
level of Ethernet (not 802.3) repeater/hub devices connected.
> So what if they aren't mentioned in
> the specs? Why does the spec make the electrons dance nicer?
I find this very odd coming from you, given the "must absolutely be a
supported configuration" message you're sending about your environment
elsewhere on DECUServe.
I doubt lima beans are mentioned in the spec either, and using them will
break your network 8-) Seriously, the spec states what designers have to
do to be interoperable, and what sorts of configurations are valid. You
can occasionally break the rules and get away with it.
What makes the DELNI so evil is that it will often appear to work just
fine, until traffic grows or some apparently unrelated change is made
elsewhere on the network. Also, there are a lot of subtle issues with the
DELNI that don't get factored in (for example, what's the equivalent
transceiver cable length of a DELNI?) that can cause you to unknowingly build
an invalid configuration. Again, that's beyond the other aspects of the
DELNI that make them unsuitable.
The DELNI was a good solution for its time - when thickwire was the only
Ethernet medium and transceivers cost hundreds of dollars. It made sense
to use a DELNI as a "virtual backbone". For that time and those slower
systems, it made lots of sense and saved money. When the attached systems
got faster, when the backbone de-virtualized, and when the world switched
to 802.3, the DELNI was no longer a good solution. For a trivial amount of
money, you can dump the DELNI - just get up to 8 $29 (list) Allied Telesyn
MX20 TP transceivers, 8 $3 CAT 5 cables, and a $200 twisted pair hub.
Note 1290.7, 11-May-1997
Harvey: Hey, lighten up!
------------------------
>Seriously, the spec states what designers have to
>do to be interoperable, and what sorts of configurations are valid.
Terry, this is just not true. The spec gives bean counter rules for
people who can't (or don't want to) use their engineering common sense.
The electrons don't know about the spec.
>You can occasionally break the rules and get away with it.
         ^^^^^^^^^^^^
More nonsense. It depends on the rule. You can't get a transceiver to
work without applying power, I agree. But how about VSWR? How about
cable length? How about the number of taps? How about the number of
repeaters? How about the spacing between taps? How about cable
characteristic impedance? How about wire gauge for 10BaseT? You know
perfectly well these rules can all be violated. You have probably
violated every one yourself.
I agree the rules are good guides, and prudent engineering should
attempt to follow them. However, engineering is also getting the job
done. You seem to advocate (What makes the DELNI so evil...) junking
lots of perfectly good hardware because you have had a problem with it.
Is that good advice?
> I doubt lima beans are mentioned in the spec either, and using them will
>break your network 8-)
Beans? Galileo fell for that, too, so you're in good company. :-)
Note 1290.8, 11-May-1997
Ferguson: We're trashing our DELNI's, want some?
------------------------------------------------
> Terry, this is just not true. The spec gives bean counter rules for
> people who can't (or don't want to) use their engineering common sense.
> The electrons don't know about the spec.
Jack, perhaps you have a very unique environment where everyone who
must maintain your network understands these from an engineering
standpoint well enough to second guess the rules.
We don't. And while I've not had nearly the experience with DELNI's
Terry has (or anything else probably), most of our problems have come
from 2nd or 3rd order changes, where someone "cheats" and does so
intelligently, then along comes someone else and does something that
appears perfectly valid and everything breaks. E.g. pulling out the
loopback connector on a standalone DELNI and switching it out of global
mode "because I needed a loopback connector and obviously it wasn't
serving a purpose".
Performance and heartbeat issues are another area.
Note 1290.9, 13-May-1997
Altobello: DELNI/DEMPR works fine, if you can handle the "rules"
----------------------------------------------------------------
Here we have used the DELNI/DEMPR combination extensively over the
years, and it has worked well for us. Yes, if you don't disable
heartbeat (SQE) you can have problems. Yes, if you cascade the DELNIs
you have problems.
If you stick a scissors into a wall socket, you have problems, too.
There are specific rules you must follow to get DELNI/DEMPR combos to
work, and in my experience, and in the experience of others of my
colleagues here, it has been a perfectly workable combination.
As far as getting "strange problems that are hard to diagnose", our #1
candidate for that was having a single Ethernet backbone. This allowed
any transceiver that went wacky to drop the whole backbone and was hard
as the dickens to troubleshoot. That configuration was corrected many
years ago, five or more. And since the collapsed backbone that was
used then had numerous bridged Ethernet segments, we pulled the DELNIs
and connected the DEMPRs right up.
I suspect there are still DELNI/DEMPR setups somewhere in the network,
though we have long since gone with other configurations.
But the pair did work, and when it didn't it was usually either because
someone didn't follow some simple rules, or because of the more general
problem of having a broadcast medium hung all over the place.
Note 1290.10, 15-May-1997
Lopez: As I was saying ...
--------------------------
Ok, Ok, so the DELNI is not demon-possessed. May I ask another question?
A few notes back in this thread someone talked about connecting a terminal
server directly to a hub using a transceiver. This sounds better than my
original suggested config, although it does take up ports on the hub. So,
will this work:
DECServer200 -- Transceiver -- Cat5 -- Hub port ?
Note 1290.11, 15-May-1997
Raphael: I think so
-------------------
I believe we are doing this now quite comfortably.
Note 1290.12, 15-May-1997
Kennedy: Yup
------------
Yes. If you turn off heartbeat on the transceiver (most non-DEC ones I've
seen come this way by default) you'll see a warning on the terminal server
(in "show server status" - something like "self-test status: 08-00-00" as I
recall) because the server is Ethernet II, not 802.3. This is harmless and
doesn't affect anything (or you could always enable heartbeat).
Note 1290.13, 19-May-1997
Lopez: Cool!
------------
Thanks again. I can now justify my DECUServe subscription renewal ... ;)
Note 1290.14, 20-May-1997
Tillman: Another symptom
------------------------
You'll also notice that the green light blinks instead of being a
steady green when "heartbeat" is disabled.
Hooking up to ISDN PRI
------- -- -- ---- ---
Abstract:
Suppose you have a router with an ISDN PRI (Primary Rate Interface)
module, and a connection installed by your local telco -- which is
to say, a jack in the wall. What do you need to connect things up?
A cable? A CSU? And what exactly does a CSU do, anyway?
Participants:
Harris Berkowitz, Linwood Ferguson, Terry Kennedy.
Conference: DEC_NETWORKING
Note 1289.0, 1-May-1997
Berkowitz: PRI with CISCO 4700
------------------------------
What is generally used in between the ISDN PRI module on a Cisco 4700 and
the telco jack?
My Cisco vendor first recommended an AT&T (Lucent) CSU in the $2200 range.
They said that there's a less expensive Adtran solution but haven't heard
back with the details yet.
Anyone else doing PRI with Cisco?
Note 1289.1, 2-May-1997
Kennedy: Replace your vendor
----------------------------
Just a cable, Cisco part number CAB-7KCT1DB15, $100 list price. You can
make up the cable yourself (we do) - all you need is a DB-15 and a Cat 5
patch cable. Whack one end of the Cat 5 cable off, put the white/blue pair
on DB-15 pins 1 and 9, and the orange/white pair on DB-15 pins 3 and 11.
Polarity (which pin gets the white wire) doesn't matter, though you may
have to swap the 1/9 and 3/11 sets (moving white/blue to 3/11 and so forth).
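For anyone making up the cable, the wiring just described can be written down as a small lookup table. This is only a bookkeeping sketch in Python: the pin numbers and wire colors are the ones given above, while the names WIRING and swap_pairs are made up for illustration.

```python
# DB-15 pin sets and Cat 5 wire pairs, exactly as given in the note.
# The table layout and the function name are illustrative only.
WIRING = {
    (1, 9): "white/blue",
    (3, 11): "orange/white",
}


def swap_pairs(wiring):
    """Return the alternate wiring, with the two wire pairs exchanging pin sets."""
    (pins_a, pair_a), (pins_b, pair_b) = wiring.items()
    return {pins_a: pair_b, pins_b: pair_a}


if __name__ == "__main__":
    for label, table in (("first try", WIRING), ("if that fails", swap_pairs(WIRING))):
        for pins, pair in table.items():
            print(f"{label}: {pair} pair on DB-15 pins {pins[0]} and {pins[1]}")
```

Polarity within a pair is deliberately not modeled, since the note says it doesn't matter.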
> My Cisco vendor first recommended an AT&T (Lucent) CSU in the $2200 range.
> They said that there's a less expensive Adtran solution but haven't heard
> back with the details yet.
I think you need a new vendor who can at least read the Cisco catalog.
By the way, you can get T1 CSU's (not that this configuration needs one)
for $800 or so - $100 less if you don't need fractional T1.
> Anyone else doing PRI with Cisco?
No, but we're using the MIP (the Cisco 7xxx version of your 4xxx card)
for T1's, which uses the same hardware and cabling as ISDN PRI.
Note 1289.2, 7-May-1997
Ferguson: Really confused, need education before new vendor
-----------------------------------------------------------
Terry, can you elaborate a bit. There's a bit more to this story.
I looked in the Cisco CD catalog, and it sure appeared to show that you
needed a CSU (it just said CSU, nothing about DSU).
The vendor that wanted $2200 for a CSU is what first set bells ringing,
since we were already getting Crays for far less than that, and had
several sitting around. Except those were CSU/DSU's if I'm using the
right terminology.
After your note, I asked Cisco's TAC, telling them we had some
seemingly conflicting info whether we just needed a cable or CSU or
both. Their very terse response was that we needed a CSU (nothing
about any particular types).
Maybe what we need is an education (and maybe our vendor does as well,
but I can't help him). My limited understanding was that the CSU and
DSU are really separate things that often come in the same box, and
that the PRI interface on the Cisco takes care of the DSU
part. But not sure about the CSU part, as I'm a bit weak on what
exactly a CSU does.
Then there's the simple issue that the Cisco catalog expects a 15pin to
15pin cable from the PRI interface to the CSU, and any CSU/DSU we have
has a V35 cable -- another reason I'm assuming that we need a CSU-only
not a CSU/DSU.
Can you or someone elaborate a bit on how this stuff all works, and (if
it really is) how a CSU can be optionally replaced with a cable? Am I
right in my highly vague idea of the CSU/DSU portions (and might
someone give a better definition of what each portion actually does)?
Note 1289.3, 8-May-1997
Kennedy: Gory details explained
-------------------------------
> Terry, can you elaborate a bit. There's a bit more to this story.
Sure.
> I looked in the Cisco CD catalog, and it sure appeared to show that you
> needed a CSU (it just said CSU, nothing about DSU).
That is indeed what the catalog says. However, I've never used a CSU with
these and nobody (among the other ISP's I know) has either. More on this
below.
> The vendor that wanted $2200 for a CSU is what first set bells ringing,
> since we were already getting Crays for far less than that, and had
> several sitting around. Except those were CSU/DSU's if I'm using the
> right terminology.
Right - a DSU adapts an interface (for example, V.35) to the physical trans-
mission channel (like a T1), while a CSU provides isolation and supports
diagnostic functions like loopback and the T1 performance monitoring counters.
> After your note, I asked Cisco's TAC, telling them we had some
> seemingly conflicting info whether we just needed a cable or CSU or
> both. Their very terse response was that we needed a CSU (nothing
> about any particular types).
>
> Maybe what we need is an education (and maybe our vendor does as well,
> but I can't help him). My limited understanding was that the CSU and
> DSU are really separate things that often come in the same box, and
> that the PRI interface on the Cisco takes care of the DSU
> part. But not sure about the CSU part, as I'm a bit weak on what
> exactly a CSU does.
Here's the scoop: First, you can't use a DSU (or CSU/DSU) with the
channelized T1 products because the CT1 cards need access to various parts
of the T1 protocol that are stripped out by a DSU. Remember, a DSU provides
you with (assuming B8ZS/ESF T1's) 1536 kbit/s as one "lump", or some subrate
of that if you're using fractional T1. It doesn't provide delineation of
the individual 56 or 64 kbit/s slots.
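The 1536 figure follows from standard T1 framing: 24 time slots of 8 bits each, 8000 frames per second, plus one framing bit per frame. The short Python sketch below just redoes that arithmetic. The function and variable names are illustrative, and treating a 56 kbit/s slot as "7 usable bits out of 8" is a simplification.

```python
# Standard T1 framing figures; names are illustrative.
CHANNELS = 24          # time slots per frame
FRAME_RATE = 8000      # frames per second
BITS_PER_SLOT = 8
FRAMING_BITS = 1       # one framing bit per frame


def t1_rates(usable_bits_per_slot=8):
    """Return (per-slot, total payload, line) rates in kbit/s."""
    slot = usable_bits_per_slot * FRAME_RATE // 1000
    payload = CHANNELS * slot
    line = (CHANNELS * BITS_PER_SLOT + FRAMING_BITS) * FRAME_RATE // 1000
    return slot, payload, line


print(t1_rates(8))  # 64 kbit/s slots, the "one lump" payload, and the line rate
print(t1_rates(7))  # 56 kbit/s slots
```

A DSU hands over only the payload figure as a single stream. A channelized card needs the per-slot boundaries, which is why it has to see the framing itself.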
Next, the CT1 cards actually speak "DSX-1", which is the way T1's show
up at central offices, colocation points, and out of things like M13 mux-
es. Some things that speak DSX-1 don't bother implementing things like
loopback or the performance monitoring registers (since these devices are
intended for CO or colocation use, they don't have to comply with end user
requirements). The CT1 cards *do* support loopback and the performance
monitoring registers.
The other issue is that customer T1 spans can be delivered "wet" or "dry".
"Wet" implies that there's simplex power +/- 130V on the cable for powering
repeaters, etc. that the telco might need between them and your site. Almost
all modern telco-supplied T1's have some sort of network interface unit at
the demarcation point which can be looped up/down by telco control, as well
as stripping out any simplex power.
Telcos will usually test to this NIU and, if it tests good, say "problem's
in your cable or equipment", so the diagnostic facilities that might not be
in another DSX-1 device (but *are* in the CT1) don't really matter.
> Then there's the simple issue that the Cisco catalog expects a 15pin to
> 15pin cable from the PRI interface to the CSU, and any CSU/DSU we have
> has a V35 cable -- another reason I'm assuming that we need a CSU-only
> not a CSU/DSU.
>
> Can you or someone elaborate a bit on how this stuff all works, and (if
> it really is) how a CSU can be optionally replaced with a cable? Am I
> right in my highly vague idea of the CSU/DSU portions (and might
> someone give a better definition of what each portion actually does)?
Hopefully the above clarifies things. I'd have no qualms whatsoever about
just making up the cable as I've described and using it - I've done this with
the Cisco 7xxx MIP card (which is the same as the 4xxx CT1 except it's for
the 7xxx and has 2 T1 ports instead of 1). At our colocation facility in
NYC, there are 3 other ISP's with MIP cards (a total of 15 more MIP cards -
30 T1's) all doing this.
If you want a cable for this, drop me a note and I'll make one up and send
it to you. Just let me know how long you want it. You might also want to
make sure that the T1 the phone company is giving you doesn't have simplex
voltage on it (you might be able to see the setting if it's in a clear case
- look for options like "SPAN power to CPE", etc. or call your phone company
and ask - it should be marked in their records). If you have a data installer
coming out regularly, he'll be able to check and let you know. Or you can
use a meter (set to a DC range >= 300V) and check the jack - no pin should
have more than about 6V to any other pin. Remember, you're looking at a high
voltage here, so be careful. [This is one of the reasons T1's usually *don't*
get handed to customers with simplex power on 'em.]
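If you write the meter readings down, the "about 6V" rule of thumb is easy to apply mechanically. The Python sketch below assumes you have recorded each pin-to-pin DC reading by hand. The function name and the sample readings are hypothetical, and it is no substitute for care around a live span.

```python
LIMIT_VOLTS = 6.0  # the "about 6V" rule of thumb from the note


def suspicious_pairs(readings, limit=LIMIT_VOLTS):
    """readings maps (pin_a, pin_b) to measured DC volts.

    Returns the pin pairs whose magnitude exceeds the limit, sorted.
    """
    return sorted(pair for pair, volts in readings.items() if abs(volts) > limit)


# Hypothetical readings, for illustration only.
example = {(1, 2): 0.3, (1, 4): -130.0, (2, 5): 5.9}
print(suspicious_pairs(example))
```

Any pair it reports suggests simplex power may be present, which is worth raising with the phone company before connecting anything.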
Note 1289.4, 19-May-1997
Ferguson: One day we'll find a way to stump Terry, but not this time
--------------------------------------------------------------------
>patch cable. Whack one end of the Cat 5 cable off, put the white/blue pair
>on DB-15 pins 1 and 9, and the orange/white pair on DB-15 pins 3 and 11.
>Polarity (which pin gets the white wire) doesn't matter, though you may
>have to swap the 1/9 and 3/11 sets (moving white/blue to 3/11 and so forth).
Bingo.
Note 1289.5, 19-May-1997
Kennedy: 8-)
------------
> -< One day we'll find a way to stump Terry, but not this time >-
But you're having so much fun trying... 8-)
> Bingo.
Out of fuel, or working? [From a computerist pilot, it's hard to tell]
Note 1289.6, 20-May-1997
Ferguson: I never understood the term "bingo fuel" anyway
---------------------------------------------------------
It's working.
I'm running out of fuel rapidly; sleep must come sometime this week (I
hope).
So both interpretations are valid.
On a related subject as to "working". The US West folks called me
back. Now remember that I had this connected and had calls going in
both directions just fine. They decided that it was mis-configured.
According to them, we were configured for "NI2" (who knows why, they say
that's what we ordered, which is possible), but our router can't do NI2
(which I think is correct), so it cannot be working. So they want to
change it to be "custom" so it will work. Except it's working.
But "it can't work that way". Sigh. Anyone want to take bets on
what happens after they "fix" it?
Laser Printer Error Code
----- ------- ----- ----
Abstract:
The mysteries of the "50 SERVICE" message displayed on DEClaser 2150
(and other) laser printers are revealed in the following notes,
including recommended corrective actions.
Participants:
Rob Aldridge, Joe Gallagher, Terry Kennedy.
Conference: HARDWARE_HELP
Note 2141.0, 5-May-1997
Gallagher: What is meaning of DEClaser 2150 error messages?
-----------------------------------------------------------
When a DEClaser printer gives an error message of the form "NN SERVICE"
where NN is between 50 and 99, does anyone know the meaning of these
messages?
Note 2141.1, 5-May-1997
Kennedy: What's the exact code?
-------------------------------
If these are the printers that are descended from Canon engines (a la
the HP LaserJet II - if it takes a HP II cartridge it's one of these),
they indicate various problems - but you knew that. If you post the
exact code, I'll look it up in my HP/Canon service guides (this assumes
DEC didn't just change the codes to be perverse).
Some sample malfunctions are fuser too hot/cold, DC power problems,
etc.
Note 2141.2, 6-May-1997
Gallagher: Error code is "50 SERVICE"
-------------------------------------
Yes, the DEClaser 2150 uses the Canon engine and has the same "guts"
as an HP II.
The error code is "50 SERVICE". And thanks for any help you can
give.
Note 2141.3, 7-May-1997
Aldridge: You may need outside help to fix error 50
---------------------------------------------------
From an HP 4 online manual:
50, 57 or 58 SERVICE The printer identified an internal service
error. If any of these errors appears, switch the printer off and
then back on. If the error continues, call your dealer or HP service
representative. Note: To clear the 50 SERVICE error the printer must be
off for at least 10 minutes.
From a Laserjet Plus online manual:
50, 51, 52, 53, 54, 55, 60, 61, 62, 63, 64, 65, 67 Operational error
Contact HP Service
Just fyi - the Microsoft Technet subscription CD-ROM has an Ultimate
Printer Manual - which provides on-line manuals for most of the HP and
some other printers.
Note 2141.4, 7-May-1997
Gallagher: Thanks for help
--------------------------
Thanks for the information. I was expecting to have to take the
printer in for service. However, I was hoping for some understanding
of the problem. The manuals are not very forthcoming with detailed
information. Perhaps there is not _THAT_ much on board diagnostic in
these older printers; the printer diagnostics knows something is wrong,
but it may not be able to tell very much about what is really wrong.
Again, thanks.
Note 2141.5, 7-May-1997
Kennedy:
---------
50 SERVICE is a fuser malfunction. The troubleshooting table says:
1) Is the fusing assembly correctly seated onto its connectors on the
AC Power Module and base plate (left and right ends)?
2) Is the +24A (sic) voltage present?
3) Is the circuit breaker on the AC Power Module tripped?
4) Is the Thermistor defective?
5) Is the Fuser Bulb open?
6) Is the Thermoprotector open?
7) Are the cable assemblies defective?
8) Is the AC Power Module defective?
9) Is the DC Controller PCA defective?
The troubleshooting info runs for just 5 pages, so if you send me your
FAX number, I'll FAX it to you.
Most "printer repair" places (at least around here) are crooks. You're
probably better off fixing it yourself (particularly since it's a DEC
variant of the unit). All of the possible causes for this problem are
generic parts, and you can get them from a place called Parts Now! (see
http://www.partsnowinc.com). A rebuilt fuser (with exchange) is about
$29-$39, depending on whether you want new cosmetic parts. They can also do
exchanges on the other parts, or sell you individual fuser parts if you
want.
Alpha SMP Performance
----- --- -----------
Abstract:
The following discussion of Alpha (and VAX) performance issues on
SMP multiprocessor machines split off from another topic (Note 2125,
on a rather arcane matter of switching CI-based systems between
production and development environments) in the same conference.
Participants:
David Campen, Jack Harvey, Larry Kilgallen, Glenn Zorn.
Conference: HARDWARE_HELP
Note 2127.0, 7-Apr-1997
Campen: How well does Alpha SMP perform?
----------------------------------------
> -< HARDWARE_HELP >-
>=========================================================================
>Note 2125.0 CI Bus Switch
>EISNER::HARVEY "Jack at SIAC" 53 lines
>-------------------------------------------------------------------------
>
> So far, so good. For this phase, the two new hardware nodes (Alpha
> 4100s with 4 CPUS and 1 GB memory, if you must know) will become NI
> cluster members of an existing VAX cluster at V6.2. They will have
I'm curious, have any benchmarks been done to determine how many users
a 4 CPU system will support vs. a 1 or 2 CPU system.
Note 2127.1, 12-Apr-1997
Harvey: What's a user?
----------------------
Oh, we had about five users with two CPUs, and it went up to about
eight with four.
:-)
[page break]
Sorry to fool around. Seriously, we are so far from a typical shop
that the concept of number of users simply doesn't arise here. There
are about eight different types of interfaces and a single user might
be active on all eight at once. About 40 nodes are supporting those
interfaces: X-terminals, VT420's, printers, wireless PCs. Maybe a
thousand people in one huge room, where whistles and hand signals are
used as much as Ethernet...
So far, our use of the 4100s has been limited to acting as Sybase
database servers. The only direct login users are operators. The
biggest challenge in using a four CPU node is getting Sybase to keep
all four doing useful work. It seems to prefer to contemplate its
navel.
Note 2127.2, 12-Apr-1997
Zorn: Users?
------------
Jack is right in .1: what is a user...
Our biggest problem with SMP boxes is CPU 0 hitting 70 to 100% on the
interrupt stack. (This is supposed to be fixed in V7.1 which will allow
for more of the IO to be performed on any CPU.) Once the interrupt stack
hits about 50 or 60%, actual performance will start to drop on the system
due to synchronization overhead. I can actually map the kbytes transferred
dropping as the interrupt stack increases.
On our 8480s with 6 Gbytes of memory I can handle about 250 clients who
run an average of 2 processes each. Most of this is constrained by VMS
and memory limitations of V6.2 in configuring their virtual address set
to 1.5 million pages. Again hopefully V7.1 will lift this and then
adding in more memory would lift that number. The systems are currently
bound by the above two factors when fully loaded.
Note: The majority of the IO is going through 4 CIPCAs to HSJ
controllers for disk read/write.
Note 2127.3, 12-Apr-1997
Kilgallen: Buffer Objects may help
----------------------------------
Some applications will reduce such inner-mode overhead by switching
to use buffer objects instead of traditional SYS$QIO. As I recall
there were supposed to be enhancements to this in V7.1.
Note 2127.4, 13-Apr-1997
Campen: Limitations inherent to Alpha.
--------------------------------------
>Some applications will reduce such inner-mode overhead by switching
>to use buffer objects instead of traditional SYS$QIO. As I recall
>there were supposed to be enhancements to this in V7.1.
The above is a problem which limits SMP performance gains and is common to both
VAX and Alpha implementations of VMS. Perhaps this will be improved in VMS 7.1.
The Alpha architecture, I believe, has its own limitations which I expect will
limit SMP performance improvements no matter what is done to the Operating
System. Consider multiple threads of an application or multiple applications
accessing a common data structure. To insure that the common data seen by the
threads or applications is consistent it is necessary to execute Memory Barrier
instructions to flush the pipelines and memory caches on the CPUs but it is
exactly these pipelines and memory caches that give the Alpha RISC architecture
its performance.
Note 2127.5, 13-Apr-1997
Kilgallen: Why is VAX slow ?
----------------------------
> The above is a problem which limits SMP performance gains and is common to
> both VAX and Alpha implementations of VMS. Perhaps this will be improved in
> VMS 7.1.
I am not sure that Buffer Objects are available to customers on VAX.
> The Alpha architecture, I believe, has its own limitations which I expect
> will limit SMP performance improvements no matter what is done to the
> Operating System. Consider multiple threads of an application or multiple
> applications accessing a common data structure. To insure that the common
> data seen by the threads or applications is consistent it is necessary to
> execute Memory Barrier instructions to flush the pipelines and memory
> caches on the CPUs but it is exactly these pipelines and memory caches
> that give the Alpha RISC architecture its performance.
The same problem exists on any SMP computer. You do not see it on VAX
because it is arbitrated in hardware. That is one of the many factors
which makes VAX slower than Alpha. What Alpha has done is expose these
considerations to software so the flushing operations only take place
when absolutely necessary.
The answer to this situation is a common one which has been around for
years -- careful application design.
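As a rough illustration of what "careful application design" can mean here, the Python sketch below has each thread do its work on private data and touch the shared total only once, under a lock. Python does not expose Alpha Memory Barrier instructions; the lock simply stands in for the synchronization point where such ordering costs would be paid. The function name is made up for the example.

```python
import threading


def parallel_sum(chunks):
    """Sum several lists, one thread per list, touching shared state once per thread."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        local = sum(chunk)   # private work: no synchronization needed
        with lock:           # the only point where shared data is touched
            total += local

    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


print(parallel_sum([[1, 2, 3], [4, 5], [6]]))
```

The alternative, locking around every individual addition, would give the same answer but pay the synchronization cost once per element instead of once per thread.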
RAID System Disk?
---- ------ -----
Abstract:
Does the idea of making your system disk a RAID set make you at all
nervous? Would you be less nervous if you could ask trusted peers
and experts about it first? That's what DECUServe's there for....
Participants:
David MacLean, Bill Norton, Keith Parris.
Conference: HARDWARE_HELP
Note 2134.0, 17-Apr-1997
Norton: RAIDset for system disk?
--------------------------------
I'm running out of empty disk slots on my HSD30, and wondering about
using a 3-disk RAID5 set of RZ29's as - gasp - the system disk.
Has anyone tried this?
Is performance really as bad as "common knowledge" advises?
How about if the page & swap files are off the system disk - would it
still be unthinkable?
Note 2134.1, 17-Apr-1997
Parris: Go for it
-----------------
At one of my current client sites there are multiple system disks which are HSJ
controller-based stripesets (2 and 3 members) of 2-member mirrorsets, and they
work fine. VMS just thinks these are large disks, and it's very handy when you
have several large-memory nodes in a cluster with large dumpfiles to store.
(The client doesn't have RAID-5 keys or I might well have used that.) The
controller's write-back cache (which is a prerequisite for either RAID-5 or
mirroring anyway) tends to basically hide the latency of writes (unless and
until you get to the point where the drives in the array behind the controller
get saturated), and the extra spindles in either type of array will help with
the performance of reads.
System disks tend to have a fairly small percentage of writes. If you're
worried, you could always move most of the read/write files (page, swap,
SYSUAF, queue files, operator logs, accounting files, etc.) off the system disk
first, to maximize the read/write ratio.
Particularly if you leave the page/swap files on the system disk, but even in
general, considering that PFCDEFAULT is 64 blocks by default, you should be
sure to raise MAXIMUM_CACHED_TRANSFER_SIZE on the unit from its default size of
32 so that page faults (PFCDEFAULT in size) or modified page writer writes (127
blocks) don't bypass the cache.
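The sizes quoted above make the point easy to check by hand. The Python sketch below does the comparison, on the assumption that a transfer larger than the unit's limit is the one that bypasses the cache (whether the limit is inclusive should be confirmed against the controller documentation). All names are illustrative.

```python
# Transfer sizes in blocks, as quoted in the note; names are illustrative.
TRANSFERS = {
    "page fault cluster (PFCDEFAULT)": 64,
    "modified page writer": 127,
}


def bypasses_cache(size_blocks, max_cached_transfer):
    # Assumption: only transfers larger than the limit skip the cache.
    return size_blocks > max_cached_transfer


def smallest_sufficient_limit(transfers):
    """Smallest limit that keeps every listed transfer in the cache."""
    return max(transfers.values())


for name, size in TRANSFERS.items():
    print(name, "bypasses cache at 32:", bypasses_cache(size, 32))
print("limit needed:", smallest_sufficient_limit(TRANSFERS))
```

With the default of 32, both kinds of transfer miss the cache under this assumption, which is the case the note warns about.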
Note 2134.2, 18-Apr-1997
MacLean: I'm doing it, and enjoying it
--------------------------------------
I've run my production nodes (dual VAX 7630) with a RAID5 system disk,
via CI-connected dual HSJ40 in SW500 box for the last couple of years,
and have not seen any performance problems because of the (HSJ-based)
RAID5, other than when I turned off WRITEBACK cache for a few minutes
(and that was very sluggish).
While most of our user files and database stuff resides on other (total
of six) RAID5 sets, most of my PAGE (and SWAP, not that it gets used)
files are off the system disk, on non-RAID volumes.
OSF/1 Disk Woes
----- ---- ----
Abstract:
In the following, we begin with a disk failure on an OSF/1 system.
The vendor sends a replacement drive, the drive is installed, and
the system refuses to come back, preferring to spew cryptic error
messages at startup and then hang. Now what?
Participants:
Bruce Bowler, Dale Coy, Mike Miller.
Conference: UNIX_OS
Note 350.0, 1-Apr-1997
Bowler: cam_logger errors installing OSF/1
------------------------------------------
Situation...
Alpha "clone" - disk goes bad. To remove the disk, I had to remove all
of the cards to drop the cage the drive was in to get to the mounting
screws on "the other side". Put everything back together with a new
drive in place. Boot to the OSF installation CD. Tell it I want to do
a "basic" installation. It asks a few questions about which disk is to
be the system disk (rz3). Then the dia(mono?)log goes something like
this...
initializing the system disk
working
system disk has been initialized
checking root file system
cam_logger: CAM_ERROR packet
cam_logger: bus 0 target 3 lun 0
ss_perform_timeout
timeout on disconnect request
cam_logger: CAM_ERROR packet
cam_logger: bus 0 target 3 lun 0
ss_perform_timeout
timeout on disconnect request
Reached max abort count scheduled bus reset
cam_logger: CAM_ERROR packet
cam_logger: bus 0
aha_bus_reset
Resetting the SCSI bus at request
Then everything hangs... The "activity" led on rz3 is on steady at
this point. Apparently the only way to clear it when it gets to this
point is to power cycle the machine. The original drive was a DEC
DSP3210. The new drive is brand x yy3210 (sorry, I'm not at the
machine right now and don't have the info written down.)
Any ideas on how to get the system installed on this disk?
Note 350.1, 1-Apr-1997
Bowler:
--------
A couple other notes that may help...
The SCSI chain looks like this...
Adaptec AHA 1740/42a+---+disk+----+tape+----+CDROM+----+term pack
(SCSI ID)                 3         6          4
There is a bank of 3 sets of 10 holes next to where the scsi ribbon
cable connects to the Adaptec card that look like they might be a
place to put the little termination resistors, but they're empty (and
they were before too). I'm wondering if there's a termination issue
here, but I don't know enough about SCSI to know for sure.
I can get to the CDROM drive with no problem... Don't know about the
tape drive, but when the system is initting the SCSI during power up
the lights light on the front of it so I'm pretty sure the controller
"sees" it.
Note 350.2, 1-Apr-1997
Coy: Get one
------------
> The SCSI chain looks like this...
>
> Adaptec AHA 1740/42a+---+disk+----+tape+----+CDROM+----+term pack
> (SCSI ID)                 3         6          4
I presume that's exactly the way it's cabled -- and presume that all of
the drives are INternal (hard to tell from your description).
> There is a bank of 3 sets of 10 holes next to where the scsi ribbon
> cable connects to the Adaptec card that look like they might be a
> place to put the little termination resistors, but they're empty (and
> they were before too). I'm wondering if there's a termination issue
> here, but I don't know enough about SCSI to know for sure.
The rule is that there must be termination on BOTH ENDS and NOWHERE
ELSE. [If there are drives on "both sides" of the Adaptec card, then
there should be no termination "on" the card]
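The rule can be checked mechanically against a description of the cable. The Python sketch below walks a chain listed end to end and reports anything that breaks "both ends and nowhere else". The chain shown is the one from the earlier note, with the assumption (not confirmed anywhere in this thread) that the empty resistor sockets mean the card itself is unterminated. The function name is made up.

```python
def termination_problems(chain):
    """chain: list of (device_name, is_terminated) in cable order, end to end."""
    problems = []
    last = len(chain) - 1
    for index, (name, terminated) in enumerate(chain):
        at_end = index in (0, last)
        if at_end and not terminated:
            problems.append(f"{name}: at an end of the bus but not terminated")
        elif not at_end and terminated:
            problems.append(f"{name}: terminated but not at an end of the bus")
    return problems


chain = [
    ("Adaptec 1740", False),  # assumption: empty sockets mean no termination
    ("disk (ID 3)", False),
    ("tape (ID 6)", False),
    ("CDROM (ID 4)", False),
    ("term pack", True),
]
print(termination_problems(chain))
```

If the card turns out to use software termination, the first entry would be True and the chain would pass.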
First guess -- there USED to be a terminator plug on the "outside
connector" of the 1740, and it isn't there now.
Or -- you need termination resistors on the card.
Or -- the cabling isn't really that way.
Of course, I'm not fully familiar with the 1740 -- but on the Adaptec
www site (obvious address), there are copies of "all of the
documentation" for all of their cards (in PDF format as I recall). Go
get the doc. for the 1740, and check if it has "software termination"
or "hardware termination".
Advice: get an external terminator plug, attach it to the back of this
thing, and see if it fixes the problem.
Note 350.3, 1-Apr-1997
Bowler:
--------
> I presume that's exactly the way it's cabled -- and presume that all of
> the drives are INternal (hard to tell from your description).
Yes on both accounts.
> ELSE. [If there are drives on "both sides" of the Adaptec card, then
> there should be no termination "on" the card]
All on "one side" of the card.
> First guess -- there USED to be a terminator plug on the "outside
> connector" of the 1740, and it isn't there now.
I haven't seen one "lying on the floor".
> Or -- you need termination resistors on the card.
Which leads to another question... Are all termination resistors
created equal or are some "more equal" than others?
> Advice: get an external terminator plug, attach it to the back of this
> thing, and see if it fixes the problem.
I would, except it's got this "really weird" (i.e. non-standard)
external plug, the likes of which I haven't seen before.
I'll check out their site tonight...
Note 350.4, 11-Apr-1997
Miller: maybe a dying disk?
---------------------------
I had almost the exact same problem!
The only way I could clear the "solid lit" drive was power off/on.
Turns out I had a dying RZ26. Use some SCU command like this:
SCU> switch /dev/whatever
SCU> show defects
If the list is big and getting bigger, you probably need to replace the
drive.
Note 350.5, 11-Apr-1997
Bowler:
--------
Despite the vendor saying "we tested that disk thoroughly before we
sent it to you", that's exactly what it was...
About the DECUServe Journal
---------------------------
The DECUServe Story
DECUServe is an electronic conferencing system, somewhat related to
bulletin board systems but much larger and more organized. It is
devoted to the general area of computer technology such as systems,
software, hardware, and communication, in the Digital and related
third party vendor market area.
DECUServe also has complete access to and from the Internet. Usenet
Newsgroups are accessible using newsreaders from DECUServe and the
comp.os.vms newsgroup is added to a VAX Notes conference of its own.
The conferencing system is available nearly 24 hours a day, seven
days per week. There is no hourly connect charge. The subscriber
pays communication costs to a phone number in eastern Massachusetts.
Reduced rate communication services are available in some areas and
INTERNET access is available (node - eisner.decus.org).
Subscriptions must be used by a single person. Company or group
subscriptions are not available, nor may subscriptions be
transferred.
DECUServe uses the Digital VAX Notes conferencing software. We
currently have over 50 technical conferences available on subjects
such as Security, the VMS Operating System, ALL-IN-1, Databases,
Site Management, Personal Computing, DEC Networking, Third Party
Software, Hardware, Workstations, the World Wide Web and many more.
Over 130,000 technical notes are on line. All conferences,
including the Frequently Asked Questions (FAQ) from the Usenet
newsgroups, are indexed to allow for fast text content searches.
You can obtain up-to-date statistics and information via the
World Wide Web at the URL
http://WWW.DECUS.ORG/decus/decusv/index.html which provides a number
of options. One option displays the activity in each of the
technical conferences. Another option allows you to read issues of
the DECUServe Journal which is published worldwide every month and
contains samples of the discussions that occur 24 hours a day.
If you have access to Internet mail, you can receive a DECUServe
Application form directly. Send mail to
application@eisner.decus.org -- the mail text may be blank. On-line
subscription information is available in the U.S. by dialing
1-800-521-8950 and logging in with username INFORMATION.
Publication Information
Topic threads in the DEC Notes conferences on DECUServe are selected
for publication on the basis of strong technical content and/or
interest to a wide audience. They are submitted to the editor from
various sources, including DECUServe Moderators, Executive Committee
members, and other volunteers. Suggestions for inclusion are
enthusiastically solicited. Articles selected for publication are
edited on an OpenVMS VAX system in TPU and then formatted with
Digital Standard Runoff.
Contact Information
-------------------
The editors of the DECUServe Journal are Brian and Sherrie McMahon.
They can be reached by any of the following means:
mcmahon_b@decuserve.decus.org
mcmahon_s@decuserve.decus.org
mcmahonb@decus.org
griffith@decus.org
bmcmahon@cisco.com
Work phone: +1 408 527 0434