HP OpenVMS Systems Documentation
DECamds User's Guide
5.2.2 Customizing Events
You can define criteria that specific events must meet before they are brought to your attention. For example, you can refine the global filtering by also specifying that the DSKRWT event (high disk device Rwait count) must meet your criteria before it is considered worth displaying or logging. To define specific event criteria, perform the following steps:
The following sections describe the event customization options.
Severity is the relative importance of an event. Events with a high severity must also exceed threshold settings before an event can be signaled for display or logging.
Each DECamds event is assigned an occurrence value, that is, the number of consecutive data samples that must exceed the event threshold before the event is signaled. By default, events have low occurrence values. However, you might find that a certain event only indicates a problem when it occurs repeatedly for an extended period. You can change the occurrence value assigned to that event so that DECamds signals it only when necessary.
For example, suppose page fault spikes are common in your environment, and DECamds frequently signals intermittent HITTLP, total page fault rate is high events. You could change the event's occurrence value to 3, so that the total page fault rate must exceed the threshold for three consecutive collection intervals before being signaled to the Event Log.
To avoid displaying insignificant events, you can customize an event so that DECamds signals it only when it continuously occurs.
Automatic Event Investigation (see Section 5.1.2) uses the occurrence value to determine when to further investigate an event. When enabled, the automatic event investigation is activated when the Occurrence count is three times the Occurrence setting value.
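The occurrence logic described above can be sketched in a few lines of Python. This is an illustration only, not DECamds code: the class name OccurrenceTracker and its fields are hypothetical, and the sketch assumes that "exceed" means strictly greater than the threshold and that the count of consecutive samples resets to zero as soon as one sample does not exceed it.

```python
from dataclasses import dataclass


@dataclass
class OccurrenceTracker:
    """Hypothetical model of one event's occurrence counting (not DECamds code)."""
    threshold: float   # event threshold, for example page faults per second
    occurrence: int    # consecutive samples required before the event is signaled
    count: int = 0     # current run of consecutive over-threshold samples

    def sample(self, value: float) -> tuple[bool, bool]:
        """Record one data sample and return (signaled, investigate)."""
        if value > self.threshold:
            self.count += 1
        else:
            self.count = 0  # assumption: a sample at or below the threshold breaks the run
        signaled = self.count >= self.occurrence
        # The guide states that investigation is activated when the occurrence
        # count is three times the occurrence setting value.
        investigate = self.count == 3 * self.occurrence
        return signaled, investigate


# Occurrence value 3 with a threshold of 20: only the last sample is signaled.
tracker = OccurrenceTracker(threshold=20.0, occurrence=3)
results = [tracker.sample(v) for v in [25, 30, 10, 22, 24, 26]]
```

In this run the spike of two samples (25, 30) is ignored because the third sample drops to 10; the event is signaled only after 22, 24, and 26 exceed the threshold in three consecutive samples.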
You can customize certain events so that the event threshold varies depending on the class of computer system the event occurs on. This feature is particularly useful in environments with many different types and sizes of computers.
By default, DECamds uses only one default threshold for each event, regardless of the type of computer the event occurs on. However, for certain events (in particular, CPU, I/O, and memory usage events) the level at which resource use becomes a problem depends on the size and type of computer. For example, a page fault rate of 100 may be important on a VAXstation 2000 system but not on a VAX 7000 system.
DECamds provides three additional predefined classes for CPU, I/O, and Memory-related events. You can specify threshold values for each class in addition to the default threshold for an event. To specify an additional event threshold for each class, edit the file AMDS$THRESHOLD_DEFS.DAT located in the AMDS$CONFIG directory.
Table 5-3 defines CPU, I/O, and Memory classes.
¹ If no class is defined, DECamds uses the default threshold value.
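The fallback rule in the footnote can be pictured as a simple lookup. The following Python sketch is hypothetical: the class names and the per-class values are invented for illustration and are not taken from Table 5-3 or from AMDS$THRESHOLD_DEFS.DAT, whose format is not shown here. Only the default of 20 comes from the guide, which gives it as the default HITTLP threshold.

```python
from typing import Optional

# Default threshold for the event (the guide gives 20 page faults per second for HITTLP).
DEFAULT_THRESHOLD = 20.0

# Hypothetical class names and values, invented for this sketch.
CLASS_THRESHOLDS = {
    "class1": 15.0,
    "class2": 50.0,
    "class3": 100.0,
}


def effective_threshold(node_class: Optional[str]) -> float:
    """Return the class-specific threshold, or the default if no class is defined."""
    return CLASS_THRESHOLDS.get(node_class, DEFAULT_THRESHOLD)


small_system = effective_threshold("class1")   # class-specific value
unclassified = effective_threshold(None)       # falls back to the default
```

A node without a class, or with a class that has no additional threshold, is compared against the single default value, which matches the default behavior described above.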
As an example of setting a class-based threshold, the HITTLP, total page fault rate is high event is a memory-related event, so the thresholds are based on the memory class definitions shown in Table 5-3. The default threshold for this event is 20 page faults per second. A page fault rate of 20 may be important on a VAXstation 2000 system, but it is not important on a VAX 7000 system. To account for this, you can specify the following additional thresholds for the HITTLP, total page fault rate is high event:
Threshold values are compared to an event's description to determine whether an event meets the criteria for display or logging. Threshold values are used in conjunction with the occurrence and severity values. Increasing event threshold values can reduce CPU use and improve perceived response time: fewer samples cross the higher thresholds, so fewer events are triggered.
You can read a description of an event by choosing Customize Events from the Customize menu in the Event Log window, then double-clicking on the event. The Event Customization dialog box displays an Event Description field.
Most events are checked against only one threshold; however, some have dual thresholds, where the event is triggered if either one is true. For example, for the LOVLSP, node disk volume free space is low event, DECamds checks both of the following thresholds:
5.3 Sorting Data
Choose Sort Data... from the Customize menu to change the order of the information displayed in a window. A dialog box appears in which you can specify sort criteria. All sort criteria must be met for a process to be displayed.
You can sort data in the following windows:
Figure 5-6 shows a sample Memory Summary Sorting dialog box.
Figure 5-6 Memory Summary Sorting Dialog Box
Sorting is based on two variables: the sort order and the sort field. You can choose only one sort criterion for each variable: one for the sort order and one for the sort field. For example, to sort Memory Summary data so that the processes with the highest page fault rates are listed first, perform the following steps:
5.4 Setting Collection Intervals
A collection interval is the time the Data Analyzer waits before requesting more information from Data Provider nodes. Changing the collection interval helps you control the performance of DECamds and its consumption of system resources.
The frequency of polling remote nodes for data (collection intervals) can affect perceived response time. You want to find a balance between collecting data often enough to detect potential resource availability problems before a node or cluster experiences a severe problem, and seldom enough to optimize perceived response time. Increasing the collection interval factor decreases CPU consumption and LAN load, but response time might appear slower because the intervals are longer.
Collection intervals do not affect memory use.
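The effect of the collection interval on polling load can be estimated with simple arithmetic. The Python sketch below is a rough model, not a DECamds formula: it assumes one data request per monitored node per interval for a single window, and the function name is hypothetical. The 0.5-second lower bound is the minimum interval stated in the note to Table 5-5.

```python
MIN_INTERVAL_SECONDS = 0.5  # the guide states that intervals cannot be less than .5 second


def requests_per_hour(interval_seconds: float, node_count: int) -> float:
    """Estimate data requests per hour for one window (simplified model)."""
    if interval_seconds < MIN_INTERVAL_SECONDS:
        raise ValueError("collection interval cannot be less than 0.5 second")
    # Assumption: exactly one request per node per collection interval.
    return node_count * 3600.0 / interval_seconds


fast = requests_per_hour(5.0, 50)    # 36000.0 requests per hour
slow = requests_per_hour(10.0, 50)   # 18000.0 requests per hour
```

Under this model, doubling the interval halves the number of requests, which is why a longer interval lowers CPU and LAN load at the cost of less frequent updates.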
To change a collection interval, choose Collection Interval from the Customize menu. Figure 5-7 shows a sample Memory Summary Collection Interval dialog box.
Figure 5-7 Memory Summary Collection Interval Dialog Box
Table 5-4 describes the fields on the Memory Summary Collection Interval dialog box.
To apply the changes, click on OK or Apply. To save collection interval changes, choose Save Collection Interval Changes from the Customize menu.
To change back to DECamds default values for the window, click on Default. To exit without making any changes, click on Cancel.
Table 5-5 lists the default window collection interval values (in seconds) provided with DECamds for each window type.
¹ All times are in seconds and cannot be less than 0.5 second.
² Process Identification Manager supports the CPU, Memory, Process I/O, and Single Lock Summary window sampling.
5.5 Optimizing Performance with System Settings
DECamds is a compute-intensive and LAN traffic-intensive application. At times, routine data collection, display activities, and corrective actions can cause a delay in perceived response time.
This section explains how to optimize perceived response time based on actual measurements of CPU utilization rates (throughput). Performance improvements can be made in the following areas:
Site configurations vary widely, and no rules apply to all situations. However, the information in this section can help you make informed choices about improving your system performance.
The following factors affect perceived response time:
5.5.1 Optimizing DECamds Software
When DECamds starts, it polls the LAN to locate all nodes running the DECamds Data Provider, creates a communications link, and collects data from each Data Provider node on the LAN. (See Section 1.1 for more information about establishing a communications link between nodes.)
The initial polling process creates a short-term high load of CPU and LAN activity. After establishing a communications link with other nodes, DECamds reduces polling frequency, thereby reducing the CPU and LAN load.
The following sections describe system settings that you can change to improve performance and the ability of DECamds to handle data.
To improve the performance of DECamds, you might need to change process quotas. The quotas used extensively by DECamds are ASTLM, TQELM, BIOLM, BYTLM, and WSEXTENT. The values listed in Section A.2 are suggestions for a 50-node cluster.
The following process quotas are recommended:
¹ Node count is the number of nodes a Data Analyzer monitors simultaneously.
Perform the following steps to change process quotas:
Setting LAN Load
The maximum size for data packets is 1500 bytes. When the amount of data is greater than 1500 bytes, DECamds must send multiple requests to complete the data collection request.
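As a worked example of the packet limit, the number of requests needed for a given amount of data is the data size divided by 1500, rounded up. The Python sketch below uses a hypothetical function name and assumes, for simplicity, that the full 1500 bytes of each packet carries collected data; any protocol overhead is ignored.

```python
MAX_PACKET_BYTES = 1500  # maximum data packet size stated in the guide


def requests_needed(data_bytes: int) -> int:
    """Return how many requests are needed to collect data_bytes of data."""
    if data_bytes <= 0:
        return 0
    # Ceiling division: a partly filled final packet still costs one request.
    return (data_bytes + MAX_PACKET_BYTES - 1) // MAX_PACKET_BYTES


single = requests_needed(1500)   # 1: fits in one packet
split = requests_needed(1501)    # 2: one byte over the limit needs a second request
larger = requests_needed(4000)   # 3
```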
Table 5-6 shows the LAN load for various levels of collection intervals and data collection. You can modify a data collection window's collection intervals (as explained in Section 5.4) or reduce the scope of data collection (as explained in Section 5.1.1) to reduce LAN activity.