HP OpenVMS Systems Documentation
OpenVMS Performance Management
|Example 7-1 Procedure to Obtain Working Set Information|
$! $! WORKING_SET.COM - Command file to display working set information. $! Requires 'WORLD' privilege to display information $! on processes other than your own. $! $! the next symbol is used to insert quotes into command strings $! because of the way DCL processes quotes, you can't have a $! trailing comment after the quotes on the next line. $! $ quote = """ $! $ pid = "" ! initialize to blank $ context = "" ! initialize to blank $! $! Define a format control string which will be used with $! F$FAO to output the information. The width of the $! string will be set according to the width of the $! display terminal (the image name is truncated, if needed). $! $ IF F$GETDVI ("SYS$OUTPUT", "DEVBUFSIZ") .LE. 80 $ THEN $ ctrlstring = "!AS!15AS!5AS!5(6SL)!7SL !10AS" $ ELSE $ ctrlstring = "!AS!15AS!5AS!5(6SL)!7SL !AS" $ ENDIF $! $! Check to see if this procedure was invoked with the PID of $! one specific process to check. If it was, use that PID. If $! not, the procedure will scan for all PIDs where there is $! sufficient privilege to fetch the information. $! $ IF p1 .NES. "" THEN pid = p1 $! $! write out a header. $! $ WRITE sys$output - " Working Set Information" $ WRITE sys$output "" $ WRITE sys$output - " WS WS WS WS Pages Page" $ WRITE sys$output - "Username Processname State Extnt Quota Deflt Size in WS Faults Image" $ WRITE sys$output "" $! $! Begin collecting information. $! $ collect_loop: $! $ IF P1 .EQS. "" THEN pid = F$PID (context) ! get this process' PID $ IF pid .EQS. "" THEN EXIT ! if blank, no more to $! ! check, or no privilege $ pid = quote + pid + quote ! enclose in quotes $! $ username = F$GETJPI ('pid, "USERNAME") ! retrieve proc. info. $! $ IF username .EQS. "" THEN GOTO collect_loop ! if blank, no priv.; try $! ! next PID $ processname = F$GETJPI ('pid, "PRCNAM") $ imagename = F$GETJPI ('pid, "IMAGNAME") $ imagename = F$PARSE (imagename,,,"NAME") ! separate name from filespec $ state = F$GETJPI ('pid, "STATE") $ wsdefault = F$GETJPI ('pid, "DFWSCNT") $ wsquota = F$GETJPI ('pid, "WSQUOTA") $ wsextent = F$GETJPI ('pid, "WSEXTENT") $ wssize = F$GETJPI ('pid, "WSSIZE") $ globalpages = F$GETJPI ('pid, "GPGCNT") $ processpages = F$GETJPI ('pid, "PPGCNT") $ pagefaults = F$GETJPI ('pid, "PAGEFLTS") $! $ pages = globalpages + processpages ! add pages together $! $! format the information into a text string $! $ text = F$FAO (ctrlstring, - username, processname, state, wsextent, wsquota, wsdefault, wssize, - pages, pagefaults, imagename) $! $ WRITE sys$output text ! display information $! $ IF p1 .NES. "" THEN EXIT ! if not invoked for a $! ! specific PID, we're done. $ GOTO collect_loop ! repeat for next PID
7.1.4 Displaying Working Set Values
The WORKING_SET.COM procedure produces the following display:
|Example 7-2 Displaying Working Set Values|
Working Set Information WS WS WS WS Pages Page Username Processname State Extnt Quota Deflt Size in WS faults Image SYSTEM ERRFMT HIB 1024 512 100 60 60 165 ERRFMT SYSTEM CACHE_SERVER HIB 1024 512 100 512 75 55 FILESERV SYSTEM CLUSTER_SERVER HIB 1024 512 100 60 60 218 CSP SYSTEM OPCOM LEF 2048 512 100 210 59 5764 OPCOM SYSTEM JOB_CONTROL HIB 1024 512 100 360 238 1459 JOBCTL SYSTEM CONFIGURE HIB 1024 512 100 125 121 101 CONFIGURE SYSTEM SYMBIONT_0001 HIB 1024 512 100 668 57 67853 PRTSMB DECNET NETACP HIB 1500 750 175 1200 812 10305 NETACP DECNET EVL HIB 1024 350 175 210 33 84080 EVL SYSTEM REMACP HIB 1024 350 175 60 47 74 REMACP SYSTEM VAXsim_Monitor HIB 1024 200 100 350 210 1583 VAXSIM SYSTEM DBMS_MONITOR LEF 1000 512 150 62 62 488 DBMMON SYSTEM TINKERBELLE LEF 1024 350 175 325 177 1627 SYSTEM NULF COM 1024 350 250 350 246 1007 FAC HALL CFAI COM 2400 1024 512 662 358 567 CFAI VTXUP VTX_SERVER LEF 2400 1024 512 962 696 624 VTXSRV WEINSTEIN Jane LEF 2400 1024 512 662 432 13132 EDT HURWITZ HURWITZ LEF 2400 1024 512 512 350 4605 CARMODY CARMODY LEF 2400 1024 512 812 546 16822 MAIL CAPARILLIO CAPARILLIO CUR 2400 1024 512 512 282 10839 STRATFORD Kathy LEF 2400 1024 512 512 210 9852 FREY _VTA270: LEF 2400 1024 512 512 163 1021 CHRISTOPHER _VTA271: LEF 2400 1024 512 512 252 379 STANLEY STANLEY LEF 2048 1024 512 512 295 10369 MINSKY MINSKY LEF 2400 1024 512 512 143 60316 TESTGEN TESTGEN LEF 4100 1024 512 234 84 75753 CLAYMORE Cluster Buster LEF 2400 1024 512 1262 932 1919 CREATOR DINEAUX Sally LEF 2400 1024 512 512 330 31803 DECNET SERVER_0848 LEF 1024 350 175 325 183 647 NETSERVER LUZ Lars LEF 2400 1024 512 1024 980 95420 TEX DECNET MAIL_222 LEF 1024 350 175 325 234 526 MAIL STEVENS STEVENS LEF 2400 1024 512 512 221 7851 ZEN _VTA259: LEF 2400 1024 512 1024 319 4267 SHOW ZEN ZEN_2 LEF 2400 1024 512 512 171 3026)
|WS Deflt||Default working set size, which is reestablished at each image activation.|
|WS Size||Current size of the working set. When the number of pages actually allocated (Pages in WS) reaches this threshold, subsequent page faults will cause page replacement.|
|Pages in WS||Both private and global pages.|
|Threshold values to which WS Size can be adjusted.|
|Page faults||Total number of faults that have occurred since process creation.|
Because allocation time is not measured directly, you should be concerned with the rates of the two memory management activities that extend the processing time experienced by processes in a virtual memory system---namely, page faulting and swapping. These activities not only incur overhead on the CPU and disk resources, but they also block the execution of processes during the time the system needs to allocate memory and the time the processes spend waiting for memory allocation.
Thus, your goal in evaluating the memory resource is to ensure that
faulting and swapping rates are kept within reasonable bounds.
7.2.1 Page Faulting
Whenever a process references a virtual page that is not in its working
set, a page fault occurs. For process execution to continue, memory
management software is called to acquire and map a physical page into
the working set.
18.104.22.168 Hard and Soft Page Faults
The fault can be hard or soft. A hard fault (measured by the Page Read I/O Rate item in the MONITOR PAGE class) is one that requires a read operation from a page or image file on disk. A soft fault is one that is satisfied by mapping to a page already in memory; this can be a global page or a page in the secondary page cache. (The secondary page cache consists of the free-page and the modified-page list; the primary page cache is each process's working set.) The following categories of soft faults are measured and reported in the MONITOR PAGE class:
The total Page Fault Rate is equal to the sum of the hard fault rate (Page Read I/O Rate) plus the soft fault rate, which is the sum of the five categories listed above.
System Fault Rate is the rate of faults for which the referenced virtual address is in system space (hex address 80000000 and above). It is not included in the overall Page Fault Rate, and is discussed separately in Section 11.1.2.
Your own judgment, based on familiarity with the data in your MONITOR summaries, is the best determinant of an acceptable Page Fault Rate for your system.
When either of the following thresholds is exceeded, you may want to consider improving memory responsiveness. (See Section 11.1.)
Paging problems typically occur when the secondary page cache (free-page list and modified-page list) is too small. This systemwide cache, which is sized by AUTOGEN, should be large enough to ensure that the overall fault rate is not excessive and that most faults are soft faults.
When evaluating paging activity on your system, you should check for processes in the free page wait (FPG), collided page wait (COLPG), and page fault wait (PFW) states and note departures from normal figures. The presence of processes in the FPG state almost always indicates serious memory management problems, because it implies that the free-page list has been depleted.
Processes in the PFW and COLPG states are waiting for hard faults (from disk) to be satisfied. Note, however, that while hard fault waiting is undesirable, it is not as serious as swapping.
An average free-page list size that is between the values of the FREELIM and FREEGOAL system parameters usually indicates deficient memory and is often accompanied by a high page fault rate. If either condition exists, or if the hard fault rate exceeds the recommended percentage, you must consider enlarging the free- and modified-page lists, if possible. Enlarging the secondary page cache could reduce hard faulting, provided such faulting is not the result of image activation.
The easiest way to increase the free page cache is to increase the value of FREEGOAL. Active reclamation will then attempt to recover more memory from idle processes. Typically, overall fault rates decrease when active reclamation is enabled because memory is more readily available to active processes.
A high rate of modified-page writing, for example, as shown in the Page Write I/O Rate field of the MONITOR PAGE display, is an indication that the modified-page list might be too small. A write rate of 1 every 2 seconds is fairly high. The modified-page list should be large enough to provide an equilibrium between the rate at which pages are added to the list versus the modified-page list fault rate without causing excessive list writing by reaching MPW_HILIMIT. If you do adjust the size of the modified-page list using MPW_HILIMIT, make sure you retain the relationship among MPW_HILIMIT, MPW_WAITLIMIT, and MPW_LOWAITLIMIT by using AUTOGEN.
If you are able to increase the size of the free-page list, you can
then allocate more memory to the modified-page list. Using AUTOGEN, you
can increase the modified-page list by adjusting the appropriate MPW
system parameters. (See the OpenVMS System Management Utilities Reference Manual for a description of MPW
7.2.2 Swapping and Swapper Trimming
Swapping, when considered in isolation, is an expensive operation. It can place a huge transfer load on the I/O subsystem instantaneously. Swapping also can place heavy demand on CPU resources. However, when used as part of the active memory reclamation policy, swapping results in improved---that is, reduced---memory consumption and a lower page fault rate.
There is good swapping and bad swapping. The latter occurs as the last step of reactive memory reclamation when the free-page list is exhausted---that is, when it is smaller than FREELIM. However, having a significant number of outswapped processes on your system when active memory reclamation is enabled is not a cause for alarm. A much more reliable indicator that harmful swapping is occurring is a high inswap rate---for example, greater than one process per second.
Before attempting to improve a system with a high inswap rate, do the following:
A possible, although unlikely, reason for a high inswap rate might be
an overly large value for FREEGOAL when active memory reclamation is
enabled. Although this policy outswaps only long-waiting processes, a
very large value for FREEGOAL will cause the outswapping of many
long-waiting processes over time, thus increasing the inswap rate as
these processes become computable.
7.3 Analyzing the Excessive Paging Symptom
Whenever you detect paging or swapping on a system with degraded performance, you should investigate a memory limitation. If you observe a lack of free memory but no serious paging or swapping, the system may be just at the point where it will begin to experience excessive paging or swapping if demand grows any more.
In this case, you have a bit of advance warning, and you may want to
examine some preventive measures.
7.3.1 What Is Excessive Paging?
There are no universally applicable scales that rank page faulting rates from moderate to excessive.
Although the only good page faulting rate is zero page faults per
second, you need to think in terms of the maximum tolerable rate of
page faulting for your system.
Observe the following guidelines:
Once you have determined that the rate of paging is excessive, you need
to determine the cause. As Figure A-3 shows, you can begin by looking
at the number of image activations that have been occurring.
7.3.3 Excessive Image Activations
Use ACCOUNTING to examine the total number of images started.
|Image-level accounting is enabled and the value is in the low-to-normal range for typical operations at your site||The problem lies elsewhere.|
|Image-level accounting is NOT enabled||Check the display produced by the MONITOR PAGE command for demand zero faults.|
|50% of all page faults are demand zero faults||Image activations are too frequent.|
If image activations seem to be excessive, do the following:
You should characterize your page faulting. Paging from disk is hard paging, and it is the less desirable of the two.
Soft paging refers to paging from the page cache in main memory. Although soft paging is undesirable when it is excessive, it is normally much less costly to overall system performance than disk paging, simply because it is faster.