From: SMTP%"RELAY-INFO-VAX@CRVAX.SRI.COM" 27-AUG-1993 10:20:40.45 To: EVERHART CC: Subj: Re: VMS time quadword puzzle From: jeh@cmkrnl.com X-Newsgroups: comp.os.vms Subject: Re: VMS time quadword puzzle Message-ID: <1993Aug26.201759.2601@cmkrnl.com> Date: 26 Aug 93 20:17:59 PDT Organization: Kernel Mode Systems, San Diego, CA Lines: 99 To: Info-VAX@KL.SRI.COM X-Gateway-Source-Info: USENET In article <01H27D8G0DQA000H1T@GW.AGR.CA>, Jim Wild writes: > I have a little command procedure that picks a thought for the day (TFD) out > of an indexed file containing N TFDs. I want the TFD to be picked 'randomly' > as the user logs in so I wrote the following bit of code to come up with a > number somewhere between 1 and N by taking the system time quadword mod N > and adding 1. Yep. My version of COOKIE does the same thing, except that all of the messages are assembled into it, and the time modulo n is used as an index into an array of descriptors. > The time quadword is grabbed as an 8 byte string into symbol QTIME and then > the two low order bytes are extracted as an integer into symbol I which is > taken modulo N. (You cannot directly take the time quadword into a numeric > symbol since DCL symbols are stored as longwords.) The address of the system > time quadword here is for VMS 5.5, it may be different in other (major?) > versions. > > $ QT_LOC = %X80004450 ! address of system time quadword > $ QTIME = F$FAO("!AD",8,QT_LOC) ! get current time quadword > $ I = F$CVUI(0,16,QTIME) ! use two lowest order bytes > $ I = I - (I/N)*N + 1 ! find I in range 1-N > > I thought that the time quadword, which counts time in hundredths of seconds, > grabbed as a user logs in, would be a suitable source for a random key into > the TFD file. Just to confirm that this obvious assumption was correct, I > put a counter on each TFD record that is incremented whenever that record is > selected. Imagine my surprise to discover that the TFDs are not selected > without bias. In fact, there is a strong pattern to the selection. Taken > modulo 8, TFD 1 is selected about 3x more often than TFDs 3, 5, and 7 which > are selected about 3x more often than TFDs 2, 4, 6, and 0. Yep. You have several problems. First, the time quadword does not count time in hundredths of seconds. Rather, its value represents time in 100-nsec increments. Its granulartiy, though, at least on VAX, is only 0.01 seconds (the rate at which the hardware interval timer generates interrupts). So the value added to the quadword time every 1/100 second is normally 100,000, which is 0.01 / 0.0000001 . (This is the default value of EXE$GL_TICKLENGTH.) First conclusion: Unless you have been playing games with EXE$GL_TICKLENGTH (who, me?), the value in the quadword will always be evenly divisible by 100,000. The "number of zeroes" rule shows that it's divisible by 32 as well. (A decimal number ending in a series of 'n' zeroes is divisible by 2**n.) You can see that this radically limits the number of different values which you'll see in EXE$GQ_SYSTIME , at least in terms of mod(small number). To make matters worse, you're looking only at the lower 16 bits. 100000 decimal is 186A0 hex; the value added to the counter doesn't even fit in the bits you're looking at. So the value you see will always be of the form (n*100000) modulo 65536 . It'll still always be evenly divisible by 32, as the low-order five bits will always be zero. The rest of the puzzle depends on what 'n' is. If it's a power of 2 up thru 32, your modulus operation will always yield 0. If it's 64, the modulus operation will always yield either 0 or 32. Things get very complex for non-power-of-2 values of 'n' -- see a number theorist. :-) But anyway, I'm sure you can see by now that a number that's of as strict a form as (n*100000) modulo 65536 does not make a very good random number seed. The right way to use EXE$GQ_SYSTIME as a random number seed is to first divide it, as a quadword, by 100000. This gets you a number that is incremented 1, 2, 3 instead of 100000, 200000, 300000. A more convenient source of random number seeds from DCL might be EXE$GL_ABSTIM or EXE$GL_ABSTIM_TICS (check with SDA or SYS.MAP for locations). These are the number of seconds and 1/100 sec ticks, resp., since the system was last bootstrapped. Either way, you get numbers that increment in units, not in large jumps, and the value fits in a longword so you can manipulate it from DCL. (just fyi, I believe that one of these is the origin of the "Uptime" display in SHOW SYSTEM.) > When I noticed this, I ran another little test putting the lines of code > above into a loop generating 100000 values for I (setting N=300). The > results were that only values 1, 5, 9, 13, ... (n-1)*4+1 showed up. > > Though this is a different result than I observed with the original program > which generated a value of I whenever a user logged on, it still shows that > there is something peculiar about the way in which the system time quadword > is accessed (or updated). Isn't this obvious? Your loop is taking some fairly constant amount of time per pass. Whether it's some number of ticks or a fraction of a tick, this means you're going to get into a fairly deterministic phase relationship with the clock values. Add that to the problems already described and you can expect to see a fairly small number of different values out of your modulus operation. Using a high-granularity clock as the seed for a pseudo-random sequence generator is a fine technique, but you can't expect to sit in a tight loop and pull *successive* randomly-distributed numbers out of the clock. --- Jamie Hanrahan, Kernel Mode Systems, San Diego CA drivers, internals, networks, applications, and training for VMS and Windows NT uucp 'g' protocol guru and release coordinator, VMSnet (DECUS uucp) W.G., and Chair, Programming and Internals Working Group, U.S. DECUS VMS Systems SIG Internet: jeh@cmkrnl.com (JH645) Uucp: uunet!cmkrnl!jeh CIS: 74140,2055