HP OpenVMS Systems Documentation
OpenVMS Performance Management
7.4.6 Large Waiting Processes not Outswapped
While using the SHOW SYSTEM command to look for large processes that are compute bound, you may, instead, observe that one or more large processes are hibernating or are in some other wait state. Possibly, swapping has been disabled for these processes. Use the SHOW PROCESS/CONTINUOUS command for each process to determine if any inactive process escapes outswapping. As a next step, invoke the System Dump Analyzer (SDA) with the DCL command ANALYZE/SYSTEM to see if the process status line produced by the SDA command SHOW PROCESS reveals the process status of PSWAPM.
If you find a process that is not allowed to swap, yet apparently consumes a large amount of memory when it is inactive, you may conclude that swapping should be enabled for it. Enabling swapping will give other processes a more equitable chance of using memory when memory is scarce and the large process is inactive. Discuss your conclusions with the owner of the process to determine if there are valid reasons why the process must not be swapped. (For example, most real-time processes should not be swapped.) If the owner of the process agrees to enable the process for swapping, use the DCL command SET PROCESS/SWAPPING (which requires the PSWAPM privilege). See the discussion of enabling swapping for all other processes in Section 11.18.
If the offending process is a disk ACP (ODS-1 only), set the system
parameter ACP_SWAPFLGS appropriately and reboot the system. See the
discussion about enabling swapping for disk ACPs in Section D.3.
If the data you collected with the F$GETJPI lexical function reveals that the working set counts (the actual memory consumed by the processes) are not particularly large, you may have too many processes attempting to run concurrently for the memory available. If that is the case and the problem persists, you may find that performance improves if you reduce the system parameter MAXPROCESSCNT, which specifies the maximum number of processes that can run concurrently. See the discussion about reducing the number of concurrent processes in Section 11.19.
However, if MAXPROCESSCNT already represents the number of users who
must be guaranteed access to your system at one time, reducing
MAXPROCESSCNT is not a viable alternative. Instead, you must explore
other ways to reduce demand (redesign your application, for example) or
add memory. See the discussion about reducing demand or adding memory
in Section 11.26.
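The comparison between working set counts and available memory can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures are hypothetical, and in practice the counts would come from the F$GETJPI data you collected.

```python
def memory_pressure(working_set_pages, available_pages):
    """Return (total demand in pages, ratio of demand to available memory)."""
    total = sum(working_set_pages)
    return total, total / available_pages


def largest_share(working_set_pages):
    """Fraction of the total demand held by the single largest working set."""
    total = sum(working_set_pages)
    return max(working_set_pages) / total if total else 0.0


# Hypothetical sample: 40 processes of 500 pages each, 16000 pages available.
counts = [500] * 40
total, ratio = memory_pressure(counts, 16000)
print(total, ratio)           # a ratio above 1 means demand exceeds supply
print(largest_share(counts))  # a small share means no single process dominates
```

A ratio above 1 combined with a small largest share is the pattern described above: no working set is particularly large, yet there are too many of them for the memory available.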
For the processes that seem to use the most memory, use the SHOW
PROCESS/CONTINUOUS command to check if the processes are operating in
the WSEXTENT region; that is, their working set sizes range between the
values of WSQUOTA and WSEXTENT. If so, it may be beneficial to
increase the values of BORROWLIM, GROWLIM, or both. Increasing both
BORROWLIM and GROWLIM discourages loans when memory is scarce. By
judiciously increasing these values, you will curtail the rate of loans
to processes with the largest working sets, particularly during the
times when the work load peaks. See the discussion about discouraging
working set loans in Section 11.20.
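The WSEXTENT region test can be expressed as a simple range check. The Python sketch below assumes you have noted each working set size along with the WSQUOTA and WSEXTENT values for the process; the function name and the sample numbers are hypothetical.

```python
def in_wsextent_region(ws_size, wsquota, wsextent):
    """True when the working set has grown past WSQUOTA, up to WSEXTENT."""
    return wsquota < ws_size <= wsextent


# Hypothetical observations: (working set size, WSQUOTA, WSEXTENT) in pages.
observations = [(900, 1024, 4096), (2500, 1024, 4096), (4096, 1024, 4096)]
borrowing = [obs for obs in observations if in_wsextent_region(*obs)]
print(len(borrowing))  # number of observed processes holding loaned pages
```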
If memory is insufficient to support all the working set sizes of active processes, ineffective swapper trimming may be the cause.
In this case, the value of SWPOUTPGCNT may be too large. Compare the
value of SWPOUTPGCNT to the actual working set counts you observe. If
you decide to reduce SWPOUTPGCNT, be aware that you will increase the
amount of memory reclaimed every time second-level trimming is
initiated. Still, this is the parameter that most effectively converts
a system from a swapping system to a paging one and vice versa. As you
lower the value of SWPOUTPGCNT, you run the risk of introducing
excessive paging. If this situation occurs and you cannot achieve a
satisfactory balance between swapping and paging, you must reduce
demand or add memory. See the discussion about reducing demand or
adding memory in Section 11.26.
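To get a feel for the effect of a lower SWPOUTPGCNT, you can estimate how many pages second-level trimming would reclaim from the working set counts you observed. The Python sketch below assumes that every listed process is eligible for trimming and is trimmed exactly to SWPOUTPGCNT; the function name and the sample values are hypothetical.

```python
def pages_reclaimed(working_set_pages, swpoutpgcnt):
    """Pages freed if each working set is trimmed down to swpoutpgcnt."""
    return sum(max(0, ws - swpoutpgcnt) for ws in working_set_pages)


# Hypothetical working set counts for three inactive processes.
observed = [300, 800, 1200]
print(pages_reclaimed(observed, 1000))  # 200
print(pages_reclaimed(observed, 500))   # 1000
```

The second figure is larger because more of each working set lies above the lower threshold, which is also why paging can increase when the trimmed processes resume.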
If you conclude that SWPOUTPGCNT is not too large, then you have already determined that the working sets are fairly large but not above quota and that few processes are computable. You will probably discover that one or more of the following conditions exist:
The first two conditions can be determined from information you have collected. However, if you suspect that too many users have used the DCL command SET PROCESS/NOSWAPPING to prevent their processes from being outswapped (even when not computable), you need to invoke the F$GETJPI lexical function for suspicious processes. (Suspicious processes are those that remain in the local event flag wait state for some time while the system is swapping heavily. You can observe that condition with the SHOW SYSTEM command.) If the flag PSWAPM in the status field (STS) is on, the process cannot be swapped. (The documentation for the system service $GETJPI specifies the status flags. See the OpenVMS System Services Reference Manual.)
As an alternative, you can use the ANALYZE/SYSTEM command to invoke SDA to enter the SHOW PROCESS command for the suspicious processes. Those that cannot be swapped will include the designation PSWAPM in the status line at the top of the display.
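Testing the PSWAPM flag amounts to checking one bit in the STS value. The Python sketch below shows only the bit test. The mask value used here is a placeholder and not the real PSWAPM bit position, which you should take from the $GETJPI documentation; the function name and the sample STS values are also hypothetical.

```python
def swapping_disabled(sts, pswapm_mask):
    """True when the PSWAPM bit selected by pswapm_mask is set in sts."""
    return (sts & pswapm_mask) != 0


# PLACEHOLDER mask for illustration only; look up the real bit for PSWAPM.
EXAMPLE_MASK = 0x10

# Hypothetical STS values for two processes.
print(swapping_disabled(0x0014, EXAMPLE_MASK))  # True: the example bit is set
print(swapping_disabled(0x0004, EXAMPLE_MASK))  # False: the example bit is clear
```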
If you determine that one or more processes should be allowed to swap,
you should seek agreement and cooperation from the users. (If agreement
is reached but users do not follow through, you could remove the users'
PSWAPM or SETPRV privileges with the /PRIVILEGES qualifier of
AUTHORIZE.) See the discussion about enabling swapping for all other
processes in Section 11.18.
If you find that a large number of processes are computable at this point in your investigation, you should ensure that disk thrashing is not initiated by the outswapping of processes while they are computing. Disk thrashing, in this case, is the outswapping of processes rapidly followed by the inswapping of the same processes.
Processes in the COMO state on the MONITOR STATES display are normally those that have finished waiting for a local event flag and are ready to be inswapped. On a system without swapping, they are new processes. However, you may find computable outswapped processes that were swapped out while they were computable. Such undesirable swapping is harmful if it occurs too frequently.
A particular work load problem must exist to provoke this situation. Suppose a number of compute-bound processes attempt to run concurrently. The processes will not be changing states while they compute. Moreover, since they are computing, they escape second-level swapper trimming to the SWPOUTPGCNT value. This condition can result in memory becoming scarce, which then could force the processes to begin swapping in and out among themselves. Whenever an outswapped process becomes computable, the scheduler is awakened to begin rescheduling. A process that is outswapped while it is computable also prompts immediate rescheduling. Thus, if the processes cannot gain enough processing time from the CPU before being outswapped and, if they are outswapped while they are computable, thrashing occurs.
If you enter the SHOW SYSTEM command and note that many of the computable outswapped processes are at their base priority, you should check to be sure that the processes are not being swapped out while they are computable. (The fact that the processes are at their base priority implies they have been attempting to run for some time. Moreover, a number of COMO processes at base priority strongly suggests that there is contention for memory among computable processes.)
You can enter the SHOW PROCESS/CONTINUOUS command for the COM processes and observe whether they fail to enter the LEF state before they enter the COMO state. Alternatively, observe whether their direct and buffered I/O rates remain low. Low I/O rates also imply that the processes have seldom gone into a local event flag wait state.
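If you note the scheduling state of a process at regular intervals, the check for outswapping while computable reduces to looking for a COMO sample that directly follows a COM sample. The Python sketch below assumes such a list of sampled states; the function name and the sample sequences are hypothetical, and a coarse sampling interval can miss a short wait.

```python
def outswapped_while_computable(states):
    """True if any COMO sample directly follows a COM sample."""
    return any(a == "COM" and b == "COMO" for a, b in zip(states, states[1:]))


# Hypothetical samples for two processes.
compute_bound = ["COM", "COM", "COMO", "COM"]
interactive = ["COM", "LEF", "LEFO", "COMO", "COM"]
print(outswapped_while_computable(compute_bound))  # True
print(outswapped_while_computable(interactive))    # False
```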
If you observe either indication that processes are being outswapped while computable, it is probable that too many highly computational processes are attempting to run concurrently or that DORMANTWAIT is set too low. However, before you attempt to adjust the rate at which processes are inswapped, you should rule out the possible effects of too many batch jobs running at the same time.
Enter the DCL command SHOW SYSTEM/BATCH to determine the number of batch jobs running concurrently and the amount of memory they consume. If you conclude that the number of concurrent batch jobs could be affecting performance, you can reduce the demand they create by modifying the batch queues with the /JOB_LIMIT qualifier. Include this qualifier on the DCL command you use to establish the batch queue (INITIALIZE/QUEUE or START/QUEUE).
If you have ruled out any possible memory contention from large
concurrent batch jobs, you can conclude that the solution involves
correcting the frequency at which the system outswaps then inswaps the
computable processes. Assuming the system parameter QUANTUM represents
a suitable value for all other work loads on the system, you can draw
the second conclusion. If you find the current priorities of the
compute-bound processes are less than or equal to DEFPRI, you should
consider increasing the special parameter SWPRATE so that inswapping of
compute-bound processes occurs less frequently. In that way, the
computing processes will have a greater amount of time to run before
they are outswapped to bring in the COMO processes. See the discussion
about reducing the rate of inswapping in Section 11.22.
If you have found a large number of computable processes that are not at their base priority and if their working sets are fairly large yet not above their working set quotas, you should investigate whether any real paging is occurring. Even when there is no real paging, there can be paging induced by swapping activity. You can identify paging due to swapping whenever a high percentage of all the paging is due to global valid page faults. Use the display produced by the MONITOR PAGE command to evaluate the page faulting.
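The proportion of paging that is due to global valid faults is a simple ratio of two rates from the MONITOR PAGE display. The Python sketch below assumes you have read the overall page fault rate and the global valid fault rate from that display; the function name and the sample rates are hypothetical, and what counts as a high percentage is left to your judgment.

```python
def global_valid_share(page_fault_rate, global_valid_fault_rate):
    """Fraction of all page faults that are global valid faults."""
    if page_fault_rate == 0:
        return 0.0
    return global_valid_fault_rate / page_fault_rate


# Hypothetical rates in faults per second.
print(global_valid_share(40.0, 30.0))  # 0.75
```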
If you conclude that most of the paging is due to swapper activity,
your system performance can improve if you induce some real paging by
decreasing the working set sizes, an action that can reduce swapping.
To induce paging, you can also reduce the automatic working set
adjustment growth by lowering WSINC or increasing PFRATH. See the
discussion about inducing paging to reduce swapping in Section 11.23.
If you reach this point in the investigation and still experience
swapping in combination with degraded performance, you have ruled out
all the appropriate ways for tuning the system to reduce swapping. The
problem is that the available memory cannot meet demand.
If your system seems to run low on free memory at times, it is a warning that you are likely to encounter paging or swapping problems. You should carefully investigate your capacity and anticipated demand.
7.5.1 Reallocating Memory
Before you decide to order more memory, look at how you have allocated memory. See Figure A-11. You may benefit by adjusting physical memory utilization so that the page cache is larger and there is less disk paging. To make this adjustment, you might have to relinquish some of the total working set space.
If working set space has been too generously configured in your system,
you have found an important adjustment you can make before problems
arise. Section 11.5 describes how to decrease working set quotas and
working set extents.
Use the following MONITOR commands to obtain the appropriate statistic:
See Table B-1 for a summary of MONITOR data items.
|Component||Elapsed Time (%)||Greatest Influencing Factors|
|I/O Preprocessing||4||Host CPU speed|
|Controller Delay||2||Time needed to complete controller optimizations|
Disk actuator speed
Relative seek range
Rotational speed of disk
Data density and rotational speed
|I/O Postprocessing||4||Host CPU speed|
Note that the CPU time required to issue a request is only 8% of the elapsed time and that the majority of the time (for 4- to 8-block transfers) is spent performing a seek and waiting for the desired blocks to rotate under the heads. Larger transfers will spend a larger percentage of time in the transfer time stage. It is easy to see why I/O-bound systems do not improve by adding CPU power.
Other factors can greatly affect the performance of the I/O subsystem. Controller cache can greatly improve performance by allowing the controller to prefetch data and maintain that data in cache. For an I/O profile that involves a good percentage of requests that are physically close to one another or multiple requests for the same blocks, a good percentage of the requests can be satisfied without going directly to the disk. In this case the overall service time for the I/O falls dramatically, while the CPU use remains fixed, thus CPU time will represent a much larger percentage of the overall elapsed time of the I/O.
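The shift in the CPU share can be illustrated with a little arithmetic. The Python sketch below starts from the 8% CPU figure given above (8 units of CPU time in a 100-unit transfer) and assumes, purely for illustration, that a cache hit cuts the non-CPU time from 92 units to 12; the function name and the 12-unit figure are hypothetical.

```python
def cpu_share(cpu_time, other_time):
    """Fraction of the elapsed I/O time that is spent on the host CPU."""
    return cpu_time / (cpu_time + other_time)


print(cpu_share(8, 92))  # 0.08, the uncached transfer
print(cpu_share(8, 12))  # 0.4, the same CPU cost over a much shorter I/O
```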
In the same situations, software caching mechanisms that use system memory (VIOC, for example) will have positive effects on performance as well. Since the mechanism is implemented in system software, CPU utilization will increase slightly, but satisfying the I/O in system memory can offer major performance advantages over even caching controllers.
Smart controllers also offer another advantage. By maintaining an internal queue, they can reorder the pending requests to minimize head movement and rotational delays and thus deliver greater throughput. Of course this strategy depends upon there being a queue to work with, and the deeper the queue, the greater the savings realized.
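The benefit of reordering a queue can be sketched by comparing total head travel for two service orders. The Python example below is not the algorithm of any particular controller: it uses a simple one-sweep elevator ordering, ignores rotational delay, and the function names, starting position, and cylinder numbers are all hypothetical.

```python
def head_travel(start, cylinders):
    """Total cylinders crossed when serving requests in the given order."""
    travel, position = 0, start
    for cyl in cylinders:
        travel += abs(cyl - position)
        position = cyl
    return travel


def elevator_order(start, cylinders):
    """Serve requests at or above the head going up, then the rest going down."""
    up = sorted(c for c in cylinders if c >= start)
    down = sorted((c for c in cylinders if c < start), reverse=True)
    return up + down


pending = [98, 183, 37, 122, 14, 124, 65, 67]
print(head_travel(53, pending))                      # 640 in arrival order
print(head_travel(53, elevator_order(53, pending)))  # 299 after reordering
```

With only one request in the queue the two orders are identical, which is why the savings depend on queue depth.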
I/O service can be optimized at the application level by using RMS
global buffers to share caches among processes. This method offers the
benefit of satisfying an I/O request without issuing a QIO. By
contrast, the system memory software cache (that is, VIOC) is checked
only to satisfy QIOs that have been issued.
8.1.2 Disk Capacity and Demand
In evaluating disk capacity in a performance context, the primary
concern is not the total amount of disk space available but the speed
with which I/O operations can be completed. This speed is determined
largely by the time it takes to access the desired data blocks (seek
time and rotational delay) and by the data transfer capacity
(bandwidth) of the disk drives and their controllers.
8.1.2.1 Seek Capacity
Overall seek capacity is determined by the number of drives (and hence,
seek arms) available. Because most disk drives can be executing a seek
operation simultaneously with those of other disk drives, the more
drives available, the more parallelism you can obtain.
8.1.2.2 Data Transfer Capacity
A data transfer operation requires a physical data channel, that is, the path from a disk through a controller, across buses, to memory. Data transfer on one channel can occur concurrently with data transfer on other channels. For this reason, it is a good idea to attempt to locate disks that have large data transfer operations on separate channels.
You can also increase available memory for cache, or use RMS global
buffers to share caches among processes.
Demand placed on the disk resource is determined by the user work load and by the needs of the system itself. The demand on a seek arm is the number, size (distance), and arrival pattern of seek requests for that disk. Demand placed on a channel is the number, size, and arrival pattern of data transfer requests for all disks attached to that channel.
In a typical timesharing environment, 90% of all I/O transfers are
smaller than 16 blocks. Thus, for the vast majority of I/O operations,
data transfer speed is not the key performance determinant; rather, it
is the time required to access the data (seek and rotational latency of
the disk unit).
For this reason, the factor that typically limits performance of the
disk subsystem is the number of I/O operations it can complete per unit
of time, rather than the data throughput rate. One exception to this
rule is swapping I/O, which uses very large transfers. Certain
applications, of course, can also perform large data transfers; MONITOR
does not provide information about transfer size, so it is important
for you to gain as much information as possible about the I/O
requirements of applications running on your system. Knowing whether
elevated response times are the result of seek/rotational delays or
data transfer delays provides a starting point for making improvements.
8.2 Evaluating Disk I/O Responsiveness
The principal measure of disk I/O responsiveness is the average response time of each disk. While not provided directly by MONITOR, it can be estimated using the I/O Operation Rate and I/O Request Queue Length items from the DISK class.
For each disk, the total activity from all nodes in the OpenVMS Cluster is of primary interest. For this reason, all references to disk statistics are to the Row Sum column of the MONITOR multifile summary, not to the Row Average.
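One common way to form such an estimate is to divide the queue length by the operation rate, which follows from Little's law. The Python sketch below assumes that relationship holds for the interval measured and is not necessarily the exact formula used elsewhere in this manual; the function name and the sample values are hypothetical.

```python
def estimated_response_ms(queue_length, operations_per_second):
    """Estimate average disk response time in milliseconds.

    Assumes Little's law: response time = queue length / operation rate.
    Returns None when the disk performed no I/O in the interval.
    """
    if operations_per_second <= 0:
        return None
    return queue_length * 1000.0 / operations_per_second


# Hypothetical figures: average queue length 0.5, 20 operations per second.
print(estimated_response_ms(0.5, 20))  # 25.0
```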