                     DOCUMENTATION FOR THE (ISD)
                  VMS PERFORMANCE MONITORING SYSTEM
                       CLUSTER VERSION (DRAFT)

                            S. C. Spriggs
                            March 6, 1986

                      Modified for Cluster Use
                            V. C. Svatos
                            June 15, 1987

* No claims are made as to this system's appropriateness in all VMS
  environments.

Overview
--------
The ISD VMS Performance Monitoring System (PMS) gives system management
information on: service level (response time), idle CPU time, average
users, disk I/O, free memory, pagefaulting and disk capacity.

There are seven command procedures which run in batch queues and deposit
data daily in files, which are plotted as needed with DATATRIEVE
procedures.

The cluster version requires that a separate directory be created for
the performance data of each node. A logical name must be created to
point to each directory; for example, PERFORM5 points to the directory
for one node, PERFORM6 to another, etc. The command procedures must be
modified so that they point to the correct directories.

The data are collected only during prime-time hours, excluding the lunch
period.

These are the seven command procedures:
---------------------------------------
- PERFTIME.COM
      Runs a standard task at a specified interval and measures how long
      it takes to do the task. On a daily basis, the times required to
      do the task are averaged and divided by how long it takes to do
      the standard task on a system with no users on it -- the result is
      the Service Level for that day. This job is the only one run from
      a priority 4 batch queue.

- NOUSERS.COM
      Samples the number of users logged in and averages it for the day.

- CPUIDLE.COM
      Uses the VMS MONITOR utility to sample CPU idle time and averages
      it for the morning and afternoon.

- PAGEFAULT.COM
      Uses the VMS MONITOR utility to sample total page faults and
      averages them for the morning and afternoon.

- MONIO.COM (Submitted to the batch queue on only one node)
      Submits a job (MONxIO) to the batch queue on each node, which uses
      VMS MONITOR to get morning and afternoon averages of I/O
      operations to each disk from each node in the cluster. A summary
      is created for total I/O to each disk. This procedure should
      always be holding in the queue for its next run. The MONxIO
      procedures will show as executing in the batch queue on each node.

- FREEMEM.COM
      Calculates available free memory at specified intervals throughout
      the day and places the daily average in a file for plotting.

- PCENDISK.COM (Runs on only one node)
      Submits PCENTDISK.COM, which calculates daily how much of the
      total capacity of each disk is in use. This job is resubmitted
      daily to run at 6:00 A.M.

These seven jobs use a combined total of about three CPU minutes/day on
a 785 during prime-time.

The graphs can be produced at any time by using a graphics terminal with
an attached LA-50 or LA-100 printer. The graphs are produced by typing
@PERFPLOT.

This is a brief explanation of most of the files in the [.PERFORM] directory:
-------------------------------------------------------

.COM FILE           SIZE  DESCRIPTION
-----------         ----  -----------
COMBINEIO.COM          1  - Executed by MONIO.COM to provide a
                            cluster-wide average of I/O to each disk.
CPUIDLE.COM           10  - explained above
DAILYNUM.COM           4  - executed by PERFTIME.COM to calculate the
                            daily service level
DAILYUSER.COM          4  - executed by NOUSERS.COM to calculate the
                            average users for the day
FREEMEM.COM           13  - explained above
MONIO.COM             15  - explained above
NODEIO.COM             3  - Submitted to each node in the cluster as
                            MONxIO.COM by MONIO.COM.
                            The node number is used for "x" in the file
                            name.
MONSUB.COM             5  - executed during system startup to submit the
                            following monitoring jobs to the batch queue
                            on one node:
                                PAGEFAULT.COM
                                FREEMEM.COM
                                CPUIDLE.COM
                                NOUSERS.COM
                                MONIO.COM
MONSUBx.COM            6  - executed during system startup of other
                            nodes to submit the following monitoring
                            jobs to these nodes:
                                PAGEFAULT.COM
                                FREEMEM.COM
                                CPUIDLE.COM
                                NOUSERS.COM
                            "x" in the file name indicates the number of
                            the VAX.
NOUSERS.COM           10  - explained above
PAGEFAULT.COM          9  - explained above
PCENDISK.COM           1  - explained above
PCENTDISK.COM          7  - executed by PCENDISK.COM to calculate the
                            percent disk utilization
PCENTDISK_RA82.COM     8  - Modify and rename to PCENTDISK.COM if there
                            are RA82's on the system.
PERFBENCH.COM         11  - used to determine a benchmark for how long
                            the standard task (run by PERFTIME.COM)
                            takes on an "empty" system
PERFPLOT.COM           2  - runs DTR and prints the graphs of the data
                            in the *.DTR files
PERFPLOT_SCREEN.COM    3  - runs DTR and sends graphs to screen with no
                            hardcopy
PERFTIME.COM          11  - explained above
-----------------------------------------------------------------

.DAT FILE           SIZE  DESCRIPTION
-----------         ----  -----------
BENCH_MARK.DAT         1  - contains the benchmark for the system (how
                            long it takes the standard task to complete
                            on an "empty" system)
DAILYCHK.DAT           1  - produced by DAILYNUM.CHK
DAILYNUM.DAT           1  - produced by DAILYNUM.COM
-----------------------------------------------------------------

The *.DTR files contain the data produced by the 7 batch jobs described
above. The files are updated either once or twice daily, and they are
the files which the PERFPLOT procedures access to produce the graphs.
All have an original size of 0.

.DTR FILE           DESCRIPTION
-----------         -----------
CPUIDLBCK.DTR       Historical data for CPU idle time
DU0CAPBCK.DTR       % utilization for disk DUA0:
DUAxCAPBCK.DTR      One file for each disk drive
DUAxIOBCK.DTR       One file for each disk drive
NOUSERBCK.DTR       Average number of users
PAGEFLTBCK.DTR      Pagefaults
FREEMEMBCK.DTR      Free Memory
PERTIMBCK.DTR       Individual data points for how long it took to run
                    the standard task
SVCLEVBCK.DTR       Daily service level calculation (ratio of average
                    time to complete the standard task divided by the
                    benchmark)
-----------------------------------------------------------------

These files contain the historical data which are no longer plotted with
PERFPLOT.COM. Original sizes are all 0. A year's worth of data will use
250 - 500 blocks depending on data stored.

.BCK FILE           DESCRIPTION
-----------         -----------
CPUIDLBCK.BCK
DUAx0CAPBCK.BCK     One file for each disk drive
DUAxIOBCK.BCK       One file for each disk drive
NOUSERBCK.BCK
NOUSERS.BCK
PERFTIME.BCK
FREEMEMBCK.BCK
-----------------------------------------------------------------

FILE                SIZE  DESCRIPTION
-----------         ----  -----------
PERDTRDEF_CLSTR       13  - DATATRIEVE DOMAIN, RECORD and PROCEDURE
                            definitions for plotting the data in the
                            *.DTR files -- it is executed by
                            PERFPLOT.COM
-----------------------------------------------------------------
DAILYNUM.CHK           4  - can be executed at any time to determine
                            what the service level is up to that time
                            of day
DRAX.EMP               0  - used by several of the command procedures
FILE1.TXT             70  - used by PERFTIME.COM during the standard
                            task
PERFCOUNT.FIL         18  - used by PERFTIME.COM
NORUNDAY.DAT           1  - used by the monitoring jobs to determine
                            weekends and holidays during which the jobs
                            do not run. This file must be maintained
                            with future holidays.
DISKIOSUM.BCK             - Summary of AM and PM I/O to all disks from
                            all nodes. Data are placed here by
                            MONIO.COM.
-----------------------------------------------------------------

Files which are created and used by the 6 monitoring jobs.

DUAx.TWO               0  - One for each disk drive
JCPUIDLE.TMP           1
JMONIO.TMP             1
JNOUSERS.TMP           1
JPAGEFAULT.TMP         2
MODES.RPT             16
MONMODES.EMP           0
MONMODES.TMP           0
MONPAGE.EMP            1
MONPAGE.TMP            1
NOUSERS.EMP            0
NOUSERS.RPT            1
NOUSERS.TWO            1

Installation Hints
------------------
The command procedures use logical names of the form "PERFORM5",
"PERFORM6", etc., which point to the directories that store the data
collected for each node in the cluster.

Since 5 of the 7 jobs are in an execute state (except for the weekend),
those 5 can just be resubmitted at system startup. (If the system
reboots on a weekend, you may get duplicate jobs in the queue. The
command procedure MONSUB.COM checks whether the jobs are already in the
queue before it submits them, but this check does not always work, for
reasons that have not been determined.)

You may have to set up a priority 4 batch queue for PERFTIME.COM
(service level calculation) on each node.

The DATATRIEVE graph labels in PERDTRDEF_CLSTR.* can be changed as
needed. You will also need to develop DATATRIEVE domain and record
definitions and procedures for each of the data files you want to plot.
Just use the existing definitions and change the names as appropriate.

Maintenance
-----------
The low priority batch queue and the priority 4 batch queue should be
periodically checked to see that the seven jobs shown below are
executing. (These jobs will not be executing on weekends and holidays.)

MONSUB.COM submits the jobs at reboot time without a log. If any of the
batch jobs die, they should be resubmitted with a log so the problem can
be found when they die again.

The file NORUNDAY.DAT has to be changed yearly to include that year's
holidays.

The *.DTR files can be accessed with an editor to remove historical data
and reduce the time span for which the data are plotted.

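The weekend and holiday rule described under Maintenance can be sketched
outside of DCL as follows. This is only an illustration in Python, not
the logic used by the command procedures. The record layout of
NORUNDAY.DAT is not described in this document, so the sketch ASSUMES
one VMS-style date (DD-MMM-YYYY) per line; the function names are
hypothetical.

```python
from datetime import date

# Month abbreviations as they appear in VMS-style dates such as 7-MAR-1986.
MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}


def parse_vms_date(text):
    """Parse a DD-MMM-YYYY string (assumed layout) into a date."""
    day, month, year = text.strip().upper().split("-")
    return date(int(year), MONTHS[month], int(day))


def load_no_run_days(lines):
    """Build the set of holidays from the lines of a holiday file.

    Blank lines are skipped. One date per line is an assumption.
    """
    return {parse_vms_date(line) for line in lines if line.strip()}


def is_no_run_day(day, holidays):
    """True on Saturdays, Sundays and listed holidays."""
    return day.weekday() >= 5 or day in holidays


def years_covered(holidays):
    """Years that have at least one holiday entry, for the yearly check."""
    return sorted({d.year for d in holidays})
```

A yearly maintenance check could call years_covered() on the loaded file
and warn when the current year is missing from the result.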
How the jobs should appear in the queues
----------------------------------------
This is a typical queue for running the five low priority monitoring
jobs:

Batch queue ?????4_BATCH, on ?????4::
    /BASE_PRIORITY=1 /JOB_LIMIT=12 /OWNER=[NOCHARGE,SYSTEM]
    /PROTECTION=(S:E,O:D,G:R,W:W)

  Jobname         Username     Entry  Status
  -------         --------     -----  ------
  CPUIDLE         SITE          1865  Executing
  PAGEFAULT       SITE          1867  Executing
 *MONxIO          SITE          1868  Executing
  NOUSERS         SITE          1875  Executing
  FREEMEM         SITE          1876  Executing
  PCENDISK        SITE            33  Holding until 7-MAR-1986 06:00

* "x" indicates node. It is not in the queue during hours it does not
  sample. When not executing, MONIO.COM will be shown as "Holding" in
  the batch queue from which MONIO is submitted.

This is the way the queue for running PERFTIME.COM should look:

Batch queue PERFORMANCE, on ?????4::
    /BASE_PRIORITY=4 /JOB_LIMIT=1 /OWNER=[NOCHARGE,SYSTEM]
    /PROTECTION=(S:E,O:D,G:R,W:W)

  Jobname         Username     Entry  Status
  -------         --------     -----  ------
  PERFTIME        SITE          1177  Executing

Guidelines
----------
- Service Level
      EUCO's experience is that users are generally satisfied with
      response time if the service level stays at 3 or below.

- Idle CPU
      On a system with minimum batch processing an idle time of around
      30-40% is required to keep the service level at 3 or better.

- Average Users
      Maximize and maintain target service level

- I/O
      Balance it across all disks -- a 3-hour average of 15 I/O ops per
      second probably means users are experiencing I/O wait time.
      Maximum capacity is around 35 per second. DEC documentation states
      that 8 I/O ops per second indicates slight usage on an RA81 and 15
      per second indicates moderate usage. I/O of about 25 indicates a
      potential bottleneck.

- Pagefaulting
      Watch for trends -- VMS manuals discuss pagefaulting.

- Disk Capacity
      Operation above 80% full is risky unless there is no upward trend.

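The numeric guidelines above can be applied to a day's figures with a
short check such as the following Python sketch. It is not part of the
PMS command procedures. Treating the quoted figures (3, 15, 25, 80%) as
hard cut-offs is an ASSUMPTION made for illustration, since the
guidelines state them as approximate; the function names and the
dictionary layout are hypothetical.

```python
def classify_disk_io(ops_per_second):
    """Map an average I/O rate (ops/second) onto the guideline bands.

    Assumed cut-offs: 15 = moderate usage, 25 = potential bottleneck.
    Anything lower is reported as slight (8/second is the quoted
    reference point for slight usage on an RA81).
    """
    if ops_per_second >= 25:
        return "potential bottleneck"
    if ops_per_second >= 15:
        return "moderate"
    return "slight"


def review_day(service_level, disk_io, disk_pct_full):
    """Return a list of warnings for one day's data.

    disk_io and disk_pct_full are dicts keyed by disk name (hypothetical
    layout), holding average ops/second and percent of capacity in use.
    """
    warnings = []
    if service_level > 3:
        warnings.append("service level above 3")
    for disk, rate in sorted(disk_io.items()):
        if classify_disk_io(rate) == "potential bottleneck":
            warnings.append(disk + ": I/O rate suggests a potential bottleneck")
    for disk, pct in sorted(disk_pct_full.items()):
        if pct > 80:
            warnings.append(disk + ": more than 80% full")
    return warnings
```

For example, review_day(2.5, {"DUA0": 10}, {"DUA0": 60}) returns an
empty list, meaning none of the guideline limits was exceeded.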
Other Command Procedures
------------------------
CHKCPUTIM.COM runs the ACCOUNTING utility, summarizes the CPU time for
the day specified, and mails the DATATRIEVE summary report to the users
specified. This is useful for investigating poor service level (after
the fact) to help determine whether any corrective action is needed. The
file DTRDEF.CPUCHK contains the DATATRIEVE definitions used by
CHKCPUTIM.COM.

MORNING_REPORT.COM formats a report with data from the previous day
which shows service level, maximum users and average users.

CLUSTER_REPORT.COM reports the previous day's data from a clustered
system.

Attachment 1 - Service Level Calculation
----------------------------------------

                      SERVICE LEVEL CALCULATION
                      -------------------------

Elapsed time to perform a standard task - all DCL commands

    |---------->                     T Bench

    Different for each CPU - 750, 780,...
        750  - 20 Seconds
        8600 - 12 Seconds

Sample each 30 minutes during prime time excluding lunch

    |------------->                  T1
    |----------------->              T2
    |--------------->                T3
    |------------------->            T4
        .
        .
        .
    |--------------------------->    TN

     Sum of Ti (i = 1 to N)
     ----------------------
               N
    ------------------------  =  Service Level (daily)
            T Bench

Attachment 2 - Major Indicators of System "Health" and Excess Capacity
-------------------------------------------
- Response Time
- Idle CPU
- No of Users
- Page Faulting
- Disk I/O
- Disk Capacity
- Free Memory
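
The service level formula in Attachment 1 can be restated as a few lines
of Python. This is a minimal sketch for checking figures by hand, not
the DCL used by PERFTIME.COM or DAILYNUM.COM; the function name is
hypothetical, and times are assumed to be in seconds.

```python
def service_level(sample_times, bench_time):
    """Daily service level: mean of the sampled task times / T Bench.

    sample_times -- elapsed times T1..TN of the standard task, in seconds
    bench_time   -- T Bench, the time on an "empty" system, in seconds
    """
    if not sample_times:
        raise ValueError("at least one sample is required")
    if bench_time <= 0:
        raise ValueError("benchmark time must be positive")
    average = sum(sample_times) / len(sample_times)
    return average / bench_time
```

With the 750 benchmark of 20 seconds and three samples of 40, 50 and 60
seconds, the average is 50 seconds and the service level is 2.5, which
is within the guideline of 3 or below.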