From: MERC::"uunet!CRVAX.SRI.COM!RELAY-INFO-VAX"  2-NOV-1992 19:43:41.12
To:   info-vax@kl.sri.com
CC:
Subj: Re: Alternate Type Ahead Buffer Questions

In article <1992Oct30.163055.15639@cs.tulane.edu>, Jeff E Mandel writes:
> In article <1992Oct30.023455.813@cmkrnl.com>, jeh@cmkrnl.com writes:
>>> I have a process that is trying to drain a 19.2K line into a file.
>>> To allow it time to process, I have split the process into two
>>> processes that communicate via a global section.  Thus the first
>>> process issues 500-byte QIO reads on the terminal line and places
>>> the results into the circular buffer global section.
>>
>> Unless you are on a multi-CPU (SMP) VAX, there is no advantage to
>> splitting the work between two processes.  Just use an AST-driven
>> thread to read the data.
>
> Well, you may want to refer back to the thread on shared VM zones we
> had last month.  The advantage to splitting it into two processes is
> that it gives you the ability to buffer with less "costly" memory if
> your writing process gets blocked.  Basically, the approach I took was
> to use the PPL$ library and create a shared VM zone.  Now I create a
> task that reads from the serial line, allocates a buffer in the shared
> zone with LIB$GET_VM, and places that buffer into a PPL$ work queue.
> The second task reads the work queue, inserts the data in the record
> into the file (in my case an Rdb database), and LIB$FREE_VM's the
> buffer.  The advantages of this are:
>
> 1) Your buffering is in virtual memory, rather than in a fixed-size
> global section, or worse, in nonpaged pool (if you use a great big
> typeahead buffer).

There is really no difference in terms of the "cost" of the memory
between what you have described and either doing the global section
yourself (just how do you think shared zones are implemented, anyway?),
or using LIB$GET_VM with the reading thread (AST-driven) and the writing
thread (either AST-driven or at the non-AST level) in the same process.
As far as the creation of virtual address space is concerned, these are
just different interfaces to the same mechanisms.  The memory is virtual
in each case.

A "fixed-size global section" has a maximum size, all right, but so does
a VM zone (16 megabytes; a global section can be larger, various SYSGEN
and UAF parameters permitting).  As far as physical memory is concerned,
the global section is just as "virtual" as the as-yet-unallocated parts
of a VM zone.  The only difference is that for the global section, page
table entries are created for the entirety of the section at one time,
while the VM zone causes a gradual expansion of virtual address space.
Either way, no physical memory is used until pages are faulted in (a few
pages at a time, and I might add that you can't control the pfc -- the
page-fault cluster size -- on a VM zone as you can with the $CRMPSC
service), and pages can be paged out of the process[es]' working set[s]
as needed.

Using a great big typeahead buffer:  Funny you should mention that; we
have to do that sometimes in uucp, to support very large packets + large
windows.  It's necessary when you can't count on the code that's reading
the serial port to wake up fast enough.  Even then it's limited to a few
tens of kilobytes (seven packets in the window * 4 Kbytes/packet).  (As
a practical matter, no one really needs to run with windows and packets
that large to optimize uucp throughput.)
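Getting the big buffer, by the way, means turning on the *alternate*
typeahead buffer for the line, whose size comes from the SYSGEN
parameter TTY_ALTYPAHD -- from DCL that's SET TERMINAL/ALTYPEAHD,
usually done once at system startup before anything is reading the
line.  From a program it's an IO$_SETMODE that sets TT2$M_ALTYPEAHD in
the extended terminal characteristics.  Something like the following;
this is typed in from memory and not compiled, so caveat emptor, and
the device name is made up:

    /* Sketch: enable the alternate typeahead buffer on a line.     */
    /* Not from our uucp sources; "TTA2:" is a made-up device name. */
    #include <descrip.h>
    #include <iodef.h>
    #include <ssdef.h>
    #include <tt2def.h>
    #include <starlet.h>
    #include <lib$routines.h>

    static struct {                   /* sense/set-mode buffer       */
        unsigned char  class, type;   /* device class, terminal type */
        unsigned short width;
        unsigned int   basic;         /* TT$M_xxx bits + page length */
        unsigned int   extended;      /* TT2$M_xxx bits              */
    } tchar;

    int main(void)
    {
        $DESCRIPTOR(ttnam, "TTA2:");
        unsigned short chan, iosb[4];
        int status;

        status = sys$assign(&ttnam, &chan, 0, 0);
        if (!(status & 1)) lib$signal(status);

        /* a 12-byte buffer gets the extended characteristics too;  */
        /* real code would check iosb[0] as well as the status      */
        status = sys$qiow(0, chan, IO$_SENSEMODE, iosb, 0, 0,
                          &tchar, sizeof(tchar), 0, 0, 0, 0);
        if (!(status & 1)) lib$signal(status);

        tchar.extended |= TT2$M_ALTYPEAHD;

        status = sys$qiow(0, chan, IO$_SETMODE, iosb, 0, 0,
                          &tchar, sizeof(tchar), 0, 0, 0, 0);
        if (!(status & 1)) lib$signal(status);
        return SS$_NORMAL;
    }

The buffer itself still comes out of nonpaged pool, TTY_ALTYPAHD bytes
of it -- a few tens of Kbytes in the scheme above.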
This is probably about the most data anyone should count on keeping in
the typeahead buffer, since there are a lot of systems out there where
finding a chunk of pool that size is a chancy thing.  (Yes, I know about
NPAGEVIR.  There are a lot of system managers who set it equal to
NPAGEDYN, thinking that they're saving on memory... life would be lots
simpler if one could write applications only for well-managed systems!)

> 2) The buffer is a queue, rather than a circular buffer.  You don't
> have to worry about overwriting live data (until you exhaust VM).

Again... there is no need to go to the PPL$ routines and shared VM zones
to get this behavior.  Nor will these routines be any cheaper in terms
of physical memory.

For this particular problem, reading data from a 19.2 Kbps line (about
2000 characters per second), there really shouldn't be much problem
keeping up unless the data must be read with very small $QIO buffers.
For example, uucp on a MicroVAX 3600 can absorb data from a Trailblazer
modem at 1400 bytes/sec, and write it to disk, and leave at least 70% of
the CPU free.  This involves doing two $QIO reads for every 64 bytes of
data (one for the 64 bytes, and one for the 6-byte header that precedes
it), PLUS a $QIO write to send the ACK to the modem for each 64-byte
packet.  At 2000 char/sec I wouldn't worry about overrunning a circular
buffer unless the CPU is VERY busy (or slow).

Now:  I agree that using LIB$GET_VM to allocate chunks of memory, and
queueing these to a listhead, etc., as you have described, is a good
technique for queueing messages from one execution thread to another,
whether the two threads are part of the same process (using ASTs, and a
process-private VM zone) or are in different processes (using the PPL$
library).  It's a heck of a lot easier than setting up a section (global
or private) and managing your own space allocation!  However, there are
time penalties to be paid for all that convenience.  The closer you are
to the edge of acceptable performance, the more you need to bypass the
"convenient" interfaces and do the work yourself.  And the mechanisms
behind the convenient interfaces need to be understood before
implementation decisions can be made.

I can't resist just one more nit:  You do need to be aware that if your
second thread is running in its own process (whether under control of
the PPL$ routines or not), there will be a slight performance penalty
due to process context switching, vs. switching between AST contexts in
a single process.  The size of the penalty depends on how many context
switches happen per second, the virtual addressing behavior of the
processes (since a context switch causes the translation buffer to be
flushed), the speed of the CPU (although process context switch speed
doesn't scale linearly with CPU speed), etc., etc.  I can't quote
numbers for this; I can tell you that this penalty is unmeasurable on a
MicroVAX II that's doing 50 context switches a second (quantum set to
2).  (That is, two compute-bound processes can get the same amount of
work done whether quantum is set to 2 or 200.)  On modern VAXes you
probably don't need to worry about this until you are doing a few
hundred or more context switches per second.

> 3) If you decide you want to use a new version of the writing process,
> kill the old one and start a new one.  You don't have to shut down the
> serial line while you restart the program, as you would in an
> AST-threaded single process.

Yep.  The application permitting, you can even set up multiple processes
to pick the data out of the queue (a rough sketch of such a consumer
follows).
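If the queue lives in memory shared between processes, the natural
primitive is the VAX interlocked, self-relative queue (LIB$INSQTI /
LIB$REMQHI) -- self-relative links are exactly what let two processes
map the section at different virtual addresses and still share the
queue.  Here is a rough consumer loop using those RTL routines rather
than the PPL$ work-queue calls Jeff used; the entry layout, listhead,
and hiber/wake handshake are my own inventions, typed in from memory
and not compiled:

    /* Sketch: one consumer draining a shared interlocked queue.     */
    #include <ssdef.h>
    #include <starlet.h>
    #include <lib$routines.h>

    globalvalue int lib$_quewasemp;  /* "queue was empty" status     */

    struct qent {
        int flink, blink;          /* self-relative links, managed   */
                                   /* by the INSQTI/REMQHI routines  */
        unsigned short len;        /* bytes of live data             */
        char data[500];
    };

    /* The listhead must be quadword-aligned, and for two processes  */
    /* it must really live in the shared zone / global section; a    */
    /* static here just for illustration.                            */
    static int qhead[2] = {0, 0};

    void consumer(void)
    {
        struct qent *e;
        unsigned int nbytes = sizeof(struct qent);
        int status;

        for (;;) {
            status = lib$remqhi(qhead, &e, 0);
            if (status == lib$_quewasemp) {
                sys$hiber();    /* producer $WAKEs us after INSQTI;  */
                continue;       /* a premature wake is remembered,   */
            }                   /* so no lost-wakeup race            */
            if (!(status & 1)) lib$signal(status);

            /* ... dispose of e->data[0..e->len-1] here ... */

            /* give the buffer back; 0 means the default zone, but   */
            /* this would be the shared zone-id in Jeff's scheme     */
            status = lib$free_vm(&nbytes, &e, 0);
            if (!(status & 1)) lib$signal(status);
        }
    }

The producer side is the mirror image:  LIB$GET_VM a buffer, fill it in
(in the terminal-read AST, say), LIB$INSQTI it onto the listhead, and
$WAKE the consumer(s).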
Suppose that instead of writing the data to disk, the job was to
summarize the data in some way.  On an SMP machine you could use one
process on one CPU to grab the data blocks and put them on the queue,
and n-1 "summarizer" processes.

> 4) When someone gives you an SMP VAX for Christmas, you have a
> parallel application all ready to go!

Not likely to happen here, but I'll keep it in mind.  :-)

> Jeff E Mandel MD MS
> Associate Professor of Anesthesiology
> Tulane University School of Medicine
> New Orleans, LA
>
> mandel@vax.anes.tulane.edu

--- Jamie Hanrahan, Kernel Mode Consulting, San Diego CA
drivers, internals, networks, applications, and training for VMS and Windows-NT
uucp 'g' protocol guru and release coordinator, VMSnet (DECUS uucp) W.G., and
Chair, VMS Programming and Internals Working Group, U.S. DECUS VAX Systems SIG
Internet:  jeh@cmkrnl.com, hanrahan@eisner.decus.org, or jeh@crash.cts.com
Uucp:  ...{crash,eisner,uunet}!cmkrnl!jeh