HP OpenVMS Systems Documentation
HP OpenVMS Programming Concepts Manual
4.4 Using Affinity and Capabilities in CPU Scheduling (Alpha and I64 Only)
On Alpha and I64 systems, the affinity and capabilities mechanisms allow CPU scheduling to be adapted to larger CPU configurations by controlling the distribution of processes or threads throughout the active CPU set. This control becomes more important as higher-performance server applications, such as databases and real-time process-control environments, are implemented. Affinity and capabilities provide the user with opportunities to perform the following tasks:
4.4.1 Defining Affinity and Capabilities
The affinity mechanism allows a process, or each of its kernel threads,
to specify an exact set of CPUs on which it can execute. The
capabilities mechanism allows a process to specify a set of resources
that a CPU in the active set must have defined before it is allowed to
contend for process execution. Both mechanisms already exist in the
OpenVMS scheduler, where they are used extensively, internally and
externally, to implement parts of the I/O and timing subsystems. The
OpenVMS operating system now also provides user access to these
mechanisms.
It is important for users to understand that inappropriate and abusive
use of the affinity and capabilities mechanisms can have a negative
impact on the symmetric aspects of the current multi-CPU scheduling
algorithm.
Capabilities are resources assigned to CPUs that a process needs to execute correctly. There are four defined capabilities. They are restricted to internal system events or functions that control system states or functions. Table 4-6 describes the four capabilities.
4.4.3 Looking at User Capabilities
Previously, the use of capabilities was restricted to system resources and control events. However, it is also valuable for user functions to be able to indicate a resource or special CPU function.
There are 16 user-defined capabilities added to both the process and the CPU structures. Unlike the static definitions of the current system capabilities, the user capabilities have meaning only in the context of the processes that define them. Through system service interfaces, processes, or individual threads of a multithreaded process, can set specific bits in the capability mask of a CPU to give it a resource, and can set specific bits in the kernel thread's capability mask to require that resource as an execution criterion.
The user capability feature is a direct superset of the current capability functionality. All currently existing capabilities are placed into the system capability set; they are not available to the process through system service interfaces. These system service interfaces affect only the 16 bits specifically set aside for user definition.
The OpenVMS operating system has no direct knowledge of what a
user-defined capability represents. All responsibility for the correct
definition, use, and management of these bits rests with the processes
that define them. The system controls the impact of these capabilities
through privilege requirements; but, as with the priority adjustment
services, abusive use of the capability bits could affect the
scheduling dynamic and CPU loads in an SMP environment.
The SYS$CPU_CAPABILITIES and SYS$PROCESS_CAPABILITIES system services provide access to the capability features. By using these services, you can assign user capabilities to a CPU and to a specific kernel thread. A user capability assigned to a CPU lasts either for the life of the system or until another explicit change is made. This operation has no direct effect on the scheduling dynamics of the system; it only indicates that the specified CPU is capable of handling any process or thread that requires that resource. If a process does not indicate that it needs that resource, the scheduler ignores the CPU's additional capability and schedules the process on the basis of its other requirements.
Assigning a user capability requirement to a specific process or thread has a major impact on the scheduling state of that entity. For the process or thread to be scheduled on a CPU in the active set, that CPU must have the capability assigned prior to the scheduling attempt. If no CPU currently has the correct set of capability requirements, the process is placed into a wait state until a CPU with the right configuration becomes available. Like system capabilities, user process capabilities are additive; that is, for a CPU to schedule the process, the CPU must have the full complement of required capabilities.
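The additive rule described above can be sketched as a bit-mask test. The following Python model is illustrative only and does not call any OpenVMS service: the names USER1 through USER3 and their bit values are stand-ins for the CAP$M_USERn constants, and the functions cpu_satisfies and eligible_cpus are hypothetical.

```python
# Stand-in bit values for illustration only; the real CAP$M_USERn
# constants come from the OpenVMS definition files.
USER1 = 1 << 0
USER2 = 1 << 1
USER3 = 1 << 2


def cpu_satisfies(cpu_caps: int, required_caps: int) -> bool:
    """True when the CPU holds every capability bit the thread requires."""
    return (cpu_caps & required_caps) == required_caps


def eligible_cpus(cpu_caps_by_id: dict, required_caps: int) -> list:
    """IDs of CPUs that may run the thread; an empty list means it waits."""
    return sorted(cpu_id for cpu_id, caps in cpu_caps_by_id.items()
                  if cpu_satisfies(caps, required_caps))


cpus = {0: USER1, 1: USER1 | USER2, 2: 0}
print(eligible_cpus(cpus, USER1 | USER2))  # [1]
print(eligible_cpus(cpus, USER3))          # []  (the thread would wait)
print(eligible_cpus(cpus, 0))              # [0, 1, 2]
```

A thread that requires no user capability matches every CPU, which mirrors the statement that an extra capability on a CPU has no effect on processes that do not ask for it.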
These services reference both sets of 16-bit user capabilities by the common symbolic constant names of CAP$M_USER1 through CAP$M_USER16. These names reflect the corresponding bit position in the appropriate capability mask; they are nonzero and self-relative to themselves only.
Both services allow multiple bits to be set or cleared, or both, simultaneously. Each takes as parameters a select mask and a modify mask that define the operation set to be performed. The service callers are responsible for setting up the select mask to indicate the user capability bits affected by the current call. The select mask is a bit vector formed by ORing the symbolic bit names; each bit that is set indicates that the corresponding bit in the modify mask holds the new value of that capability. Both masks use the same symbolic constants to indicate the same bit; alternatively, if appropriate, you can use the symbolic constant CAP$K_USER_ALL in the select mask to indicate that the entire set of capabilities is affected. Likewise, you can use the symbolic constant CAP$K_USER_ADD or CAP$K_USER_REMOVE in the modify mask to indicate that all capabilities specified in the select mask are to be either set or cleared.
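The select and modify mask semantics can be modeled with two bitwise operations. This is a minimal sketch in Python, not the implementation of either service: the bit values are stand-ins, apply_masks is a hypothetical helper, and the "add all" and "remove all" cases are modeled as all-ones and all-zeros modify masks, which is an assumption about their effect and not about the real values of CAP$K_USER_ADD and CAP$K_USER_REMOVE.

```python
# Stand-in bit values; not the real CAP$M_USERn definitions.
USER1, USER2, USER3 = 1 << 0, 1 << 1, 1 << 2
USER_ALL = 0xFFFF  # assumption: all 16 user capability bits


def apply_masks(current: int, select_mask: int, modify_mask: int) -> int:
    """Return the new capability mask.

    Bits named in select_mask take their value from modify_mask;
    all other bits keep their current value.
    """
    kept = current & ~select_mask
    updated = modify_mask & select_mask
    return (kept | updated) & USER_ALL


# Clear USER1 and set USER2 in one call; USER3 is not selected, so it stays.
new_mask = apply_masks(USER1 | USER3, USER1 | USER2, USER2)
print(bin(new_mask))  # 0b110
```

Because only selected bits change, a caller can set some capabilities and clear others in a single call without first reading the current mask.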
For information about using the SYS$CPU_CAPABILITIES and
SYS$PROCESS_CAPABILITIES system services, see the HP OpenVMS System Services Reference Manual: A--GETUAI and
HP OpenVMS System Services Reference Manual: GETUTC--Z.
There are two types of affinity: implicit and explicit. This section
describes both.
Implicit affinity, sometimes known as soft affinity, is a variant form of the original affinity mechanism used in the OpenVMS scheduling mechanisms. Rather than require a process to stay on a specific CPU regardless of conditions, implicit affinity maximizes cache and translation buffer (TB) context by maintaining an association with the CPU that has the most information about a given process.
The OpenVMS scheduling mechanism already has a version of implicit affinity. It keeps track of the last CPU the process ran on and tries to schedule the process on that CPU, subject to a fairness algorithm. The fairness algorithm makes sure a process is not skipped too many times when it normally would have been scheduled elsewhere.
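The interaction between "prefer the last CPU" and the fairness limit can be illustrated with a toy model. This is not the OpenVMS algorithm, whose details are not given here: the Proc class, the should_run_here function, and the MAX_SKIPS limit are all hypothetical, and a simple skip counter is assumed as the fairness rule.

```python
MAX_SKIPS = 3  # hypothetical fairness limit, chosen only for the example


class Proc:
    """Minimal process record: last CPU used and times passed over."""

    def __init__(self, last_cpu: int):
        self.last_cpu = last_cpu
        self.skips = 0


def should_run_here(proc: Proc, this_cpu: int) -> bool:
    """Decide whether this_cpu takes the process now.

    The process runs if this is its last CPU (cache and TB context is
    likely still warm) or if it has already been skipped too often.
    """
    if proc.last_cpu == this_cpu or proc.skips >= MAX_SKIPS:
        proc.skips = 0
        proc.last_cpu = this_cpu
        return True
    proc.skips += 1
    return False


p = Proc(last_cpu=2)
print([should_run_here(p, 0) for _ in range(4)])  # [False, False, False, True]
print(p.last_cpu)                                 # 0
```

The counter bounds how long a process can wait for its preferred CPU, which is the trade-off the fairness algorithm exists to manage.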
The Alpha architecture lends itself to maintaining cache and TB context that has significant potential for performance improvement at both the process and system level. Because this feature contradicts the normal highest-priority process-scheduling algorithms in an SMP configuration, implicit affinity cannot be a system default.
The SYS$SET_IMPLICIT_AFFINITY system service provides implicit affinity support. This service works on an explicitly specified process or kernel thread block (KTB) through the pidadr and prcnam arguments. The default is the current process, but if the symbolic constant CAP$K_PROCESS_DEFAULT is specified in pidadr, the bit is set in the global default cell SCH$GL_DEFAULT_PROCESS_CAP. Setting implicit affinity globally is similar to setting a capability bit in the same mask, because every process creation after the modification picks up the bit as a default that stays in effect across all image activations.
The privileges required to invoke SYS$SET_IMPLICIT_AFFINITY depend on
the process that is being affected. Because the addition of implicit
affinity has the same potential as the SYS$ALTPRI service for affecting
the priority scheduling of processes in the COM queue, the ALTPRI
privilege is required as the base privilege for all modifying forms of
the service. If the process is the current one, no other privilege is
required. To affect processes in the same UIC group, the GROUP
privilege is also required. For any other processes in the system, the
WORLD privilege is also required.
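The privilege rules above reduce to a small decision table. The sketch below only restates those rules in Python; required_privileges is a hypothetical helper, and the process and UIC group identifiers are plain integers chosen for the example.

```python
def required_privileges(caller_pid: int, caller_group: int,
                        target_pid: int, target_group: int) -> set:
    """Privileges needed to modify implicit affinity for the target."""
    privileges = {"ALTPRI"}          # base requirement for any modification
    if target_pid == caller_pid:
        return privileges            # current process: nothing further
    if target_group == caller_group:
        return privileges | {"GROUP"}
    return privileges | {"WORLD"}


print(sorted(required_privileges(10, 100, 10, 100)))  # ['ALTPRI']
print(sorted(required_privileges(10, 100, 11, 100)))  # ['ALTPRI', 'GROUP']
print(sorted(required_privileges(10, 100, 12, 200)))  # ['ALTPRI', 'WORLD']
```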
Even though capabilities and affinity overlap considerably in their functional behavior, they are nonetheless two discrete scheduling mechanisms. Affinity, the subsetting of the number of CPUs on which a process can execute, has precedence over the capability feature and provides an explicit binding operation between the process and CPU. It forces the scheduling algorithm to consider only the CPU set it requires, and then applies the capability tests to see whether any of them are appropriate.
Explicit affinity allows database and high-performance applications to segregate application functions to individual CPUs, providing improved cache and TB performance as well as reducing context switching and general scheduling overhead. During the IPL 8 scheduling pass, the process is investigated to see to which CPUs it is bound and whether the current CPU is one of those. If it passes that test, capabilities are also validated to allow the process to context switch. The number of CPUs that can be supported is 32.
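The two-stage test in the scheduling pass, affinity first and capabilities second, can be sketched as follows. This Python model is illustrative only: can_run_on is a hypothetical function, the masks are plain integers, and treating an empty affinity mask as "not bound to any CPU" is an assumption made for the example.

```python
MAX_CPUS = 32  # the text above gives 32 as the number of supported CPUs


def can_run_on(cpu_id: int, cpu_caps: int,
               affinity_mask: int, required_caps: int) -> bool:
    """True when the process may context switch onto cpu_id."""
    if not 0 <= cpu_id < MAX_CPUS:
        raise ValueError("CPU ID out of range")
    # Stage 1: explicit affinity. Assumption: an empty mask means unbound.
    if affinity_mask and not ((affinity_mask >> cpu_id) & 1):
        return False
    # Stage 2: the CPU must hold every required capability.
    return (cpu_caps & required_caps) == required_caps


bound_to_cpu1 = 1 << 1
print(can_run_on(1, 0b01, bound_to_cpu1, 0b01))  # True
print(can_run_on(0, 0b01, bound_to_cpu1, 0b01))  # False (fails affinity)
print(can_run_on(1, 0b00, bound_to_cpu1, 0b01))  # False (fails capability)
```

Checking affinity first narrows the candidate CPU set before any capability comparison is made, matching the precedence described above.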
The SYS$PROCESS_AFFINITY system service provides access to the explicit affinity functionality. SYS$PROCESS_AFFINITY resolves to a specific process, defaulting to the current one, through the pidadr and prcnam arguments. Like the other system services, the CPUs that are affected are indicated through select_mask, and the binding state of each CPU is specified in modify_mask.
Specific CPUs can be referenced in select_mask and modify_mask using the symbolic constants CAP$M_CPU0 through CAP$M_CPU31. These constants are defined to match the bit position of their associated CPU ID. Alternatively, specifying CAP$K_ALL_ACTIVE_CPUS in select_mask sets or clears explicit affinity for all CPUs in the current active set.
Explicit affinity, like capabilities, has a permanent process copy as well as a current image copy. As each completed image is run down, the permanent explicit affinity values overwrite the running image set, superseding any changes that were made in the interim. Specifying CAP$M_FLAG_PERMANENT in the flags parameter indicates that both the current and permanent copies are to be modified simultaneously. As a result, unless explicitly changed again, this operation has a scope from the current image through the end of the process life.
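The relationship between the permanent copy and the current image copy can be modeled with a small state object. This is a sketch of the behavior described above, not of any OpenVMS data structure: the AffinityState class and its method names are hypothetical, and the permanent argument stands in for the CAP$M_FLAG_PERMANENT flag.

```python
class AffinityState:
    """Toy model of the permanent and current-image affinity masks."""

    def __init__(self):
        self.permanent = 0
        self.current = 0

    @staticmethod
    def _merge(old: int, select_mask: int, modify_mask: int) -> int:
        # Selected bits take their value from modify_mask; others are kept.
        return (old & ~select_mask) | (modify_mask & select_mask)

    def set_affinity(self, select_mask: int, modify_mask: int,
                     permanent: bool = False) -> None:
        self.current = self._merge(self.current, select_mask, modify_mask)
        if permanent:
            self.permanent = self._merge(self.permanent,
                                         select_mask, modify_mask)

    def image_rundown(self) -> None:
        """At image rundown the permanent values overwrite the current set."""
        self.current = self.permanent


state = AffinityState()
state.set_affinity(0b011, 0b010, permanent=True)  # bind to CPU 1 for good
state.set_affinity(0b100, 0b100)                  # add CPU 2, this image only
print(bin(state.current))   # 0b110
state.image_rundown()
print(bin(state.current))   # 0b10
```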
For information about the SYS$SET_IMPLICIT_AFFINITY and
SYS$PROCESS_AFFINITY system services, see the HP OpenVMS System Services Reference Manual: A--GETUAI and
HP OpenVMS System Services Reference Manual: GETUTC--Z.
The class scheduler gives you the ability to limit the amount of CPU time that a system's users may receive by placing the users into scheduling classes. Each class is assigned a percentage of the overall system's CPU time. As the system runs, the combined set of users in a class is limited to the percentage of CPU execution time allocated to that class. The users may get some additional CPU time if the qualifier /WINDFALL is enabled for their scheduling class. Enabling the qualifier /WINDFALL allows the system to give a small amount of CPU time to a scheduling class when a CPU is idle and the scheduling class's allotted time has been depleted.
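The effect of /WINDFALL can be summarized as a single decision. The Python sketch below models only the rule stated above; class_may_run and its parameters are hypothetical, and how the system accounts for the remaining allotment is not modeled.

```python
def class_may_run(remaining_allotment: float, windfall: bool,
                  cpu_idle: bool) -> bool:
    """Whether a process in the scheduling class may receive CPU time now."""
    if remaining_allotment > 0:
        return True                    # still within the class's percentage
    # Allotment depleted: only windfall on an otherwise idle CPU helps.
    return bool(windfall and cpu_idle)


print(class_may_run(5.0, windfall=False, cpu_idle=False))  # True
print(class_may_run(0.0, windfall=False, cpu_idle=True))   # False
print(class_may_run(0.0, windfall=True, cpu_idle=True))    # True
```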
To invoke the class scheduler, you use the SYSMAN interface. SYSMAN allows a user to create, delete, modify, suspend, resume, and display scheduling classes. Table 4-7 shows the SYSMAN command, class_schedule, and its subcommands.
4.5.1 Specifications for the Class_Schedule Command
The full specifications for Class_Schedule and its subcommands are as
follows.
The format for the Add subcommand is as follows:
The class name is the name of the scheduling class. It must be specified and the maximum length for this name is 16 characters.
Table 4-8 shows the qualifiers and their meanings for this SYSMAN command.
4.5.1.2 The Delete Subcommand
The format for the Delete subcommand is as follows:
The Delete subcommand deletes the scheduling class from the class
scheduler database file, and all processes that are members of this
scheduling class are no longer class scheduled.
The format for the Modify subcommand is as follows:
The Modify subcommand changes the characteristics of a scheduling class. The qualifiers are the same as those for the Add subcommand. To remove a time restriction, specify a zero (0) for the time percentage associated with a particular range of hours.
To remove a name or UIC value, you must specify a minus sign in front
of each name or value.
The format for the Show subcommand is as follows:
Table 4-9 shows the qualifiers and their meanings for this SYSMAN command.
4.5.1.5 The Suspend Subcommand
The format for the Suspend subcommand is as follows:
The Suspend subcommand suspends the specified scheduling class. All
processes that are part of this scheduling class remain as part of this
scheduling class but are granted unlimited CPU time.
The format of the Resume subcommand is as follows:
The Resume subcommand complements the Suspend subcommand. You use this
subcommand to resume a scheduling class that is currently suspended.
The class scheduler database is a permanent database that allows
OpenVMS to class schedule processes automatically after a system has
been booted and rebooted. This database resides on the system disk in
SYS$SYSTEM:VMS$CLASS_SCHEDULE.DATA. SYSMAN creates this file as an RMS
indexed file when the first scheduling class is created by the SYSMAN
command, class_schedule add.
With a permanent class scheduler, a process is placed into a scheduling class, if appropriate, at process creation time. When a new process is created, the system must determine whether the process belongs to a scheduling class. Because this determination relies on data in the SYSUAF file, and the Loginout image already has the process's information from this file, Loginout class schedules the process if it determines that the process belongs to a scheduling class.
There are two other types of processes to consider during process creation: subprocess and detached process. A subprocess becomes part of the same scheduling class as the parent process, even though it may not match the class's criteria. That is, its user and account name and/or UIC may not be part of the class's record. A detached process joins a scheduling class only if it executes the Loginout image (Loginout.exe) during process creation.
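The class assignment rules at process creation can be summarized in a short decision function. This Python sketch only restates the rules above: class_for_new_process, the kind strings, and the uaf_class argument (the class, if any, that Loginout would select from the SYSUAF data) are hypothetical names introduced for illustration.

```python
from typing import Optional


def class_for_new_process(kind: str,
                          parent_class: Optional[str] = None,
                          runs_loginout: bool = False,
                          uaf_class: Optional[str] = None) -> Optional[str]:
    """Return the scheduling class a new process joins, or None."""
    if kind == "subprocess":
        # Inherits the parent's class even if it does not match the criteria.
        return parent_class
    if kind == "detached":
        # Classified only when Loginout runs during process creation.
        return uaf_class if runs_loginout else None
    # Any other process is classified by Loginout from the SYSUAF data.
    return uaf_class


print(class_for_new_process("subprocess", parent_class="BATCHUSERS"))  # BATCHUSERS
print(class_for_new_process("detached", uaf_class="BATCHUSERS"))       # None
print(class_for_new_process("detached", runs_loginout=True,
                            uaf_class="BATCHUSERS"))                   # BATCHUSERS
```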