HP OpenVMS Systems Documentation
HP OpenVMS Programming Concepts Manual
23.22 Fast I/O and Fast Path Features (Alpha and I64 Only)
Fast I/O and Fast Path are two optional features that can improve I/O performance. The improvement is achieved by reducing the CPU cost per I/O request and by improving symmetric multiprocessing (SMP) scaling of I/O operations. The CPU cost per I/O is reduced by optimizing code for high-volume I/O and by making better use of SMP CPU memory caches. SMP scaling of I/O is increased by reducing the number of spinlocks taken per I/O and by substituting finer-granularity spinlocks for global spinlocks.
The improvements follow a division that already exists between the device-independent and device-dependent layers in the OpenVMS I/O subsystem. The device-independent overhead is addressed by Fast I/O, which is a set of system services that can substitute for certain $QIO operations. Using these services requires some coding changes in existing applications, but the changes are usually modest and well contained. The device-dependent overhead is addressed by Fast Path, which is an optional performance feature that creates a "fast path" to the device. It requires no application changes.
Fast I/O and Fast Path can be used independently. However, together
they can provide a reduction in CPU cost per I/O on uniprocessor and on
Fast I/O is a set of three system services, SYS$IO_SETUP, SYS$IO_PERFORM, and SYS$IO_CLEANUP, that were developed as an alternative to $QIO. These services are not a $QIO replacement; $QIO is unchanged, and $QIO interoperation with these services is fully supported. Rather, the services substitute for a subset of $QIO operations, namely, only the high-volume read/write I/O requests.
The Fast I/O services support 64-bit addresses for data transfers to and from disk and tape devices.
While Fast I/O services are available on OpenVMS VAX, the performance
advantage applies only to OpenVMS Alpha and OpenVMS I64. OpenVMS VAX
has a run-time library (RTL) compatibility package that translates the
Fast I/O service requests to $QIO system service requests, so one set
of source code can be used on VAX, Alpha, and I64 systems.
The performance benefits of Fast I/O result from streamlining high-volume I/O requests. The Fast I/O system service interfaces are optimized to avoid the overhead of general-purpose services. For example, I/O request packets (IRPs) are now permanently allocated and used repeatedly for I/O rather than allocated and deallocated anew for each I/O.
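The difference between allocating a packet for each request and reusing a permanently allocated one can be sketched with a small model. This is not OpenVMS code: the `Irp` class and both path classes are hypothetical stand-ins that only count allocations.

```python
class Irp:
    """Stand-in for an I/O request packet; the fields are illustrative only."""

    def __init__(self):
        self.buffer = None
        self.length = 0


class PerRequestPath:
    """General-purpose model: a fresh packet is allocated for every I/O."""

    def __init__(self):
        self.allocations = 0

    def do_io(self, buffer, length):
        irp = Irp()                      # allocated anew for this request
        self.allocations += 1
        irp.buffer, irp.length = buffer, length
        return irp.length                # packet is discarded on return


class PreallocatedPath:
    """Fast I/O-style model: one packet is allocated at setup and reused."""

    def __init__(self):
        self.irp = Irp()                 # allocated once, at setup time
        self.allocations = 1

    def do_io(self, buffer, length):
        self.irp.buffer, self.irp.length = buffer, length
        return self.irp.length


def count_allocations(path, n_requests):
    """Issue n_requests identical I/Os and report how many packets were allocated."""
    for _ in range(n_requests):
        path.do_io(b"data", 4)
    return path.allocations
```

In this model, a thousand requests cost a thousand allocations on the per-request path and a single allocation on the preallocated path.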
The greatest benefits stem from having user data buffers and user I/O status structures permanently locked down and mapped using system space. This allows Fast I/O to do the following:
In total, Fast I/O services eliminate four spinlock acquisitions per
I/O (two for the MMG spinlock and two for the SCHED spinlock). The
reduction in CPU cost per I/O is 20% for uniprocessor systems and 10%
for multiprocessor systems.
Buffer objects accomplish the lockdown of user-process data structures. Buffer objects are process entities that are associated with a process's virtual address range. When a buffer object is created, all the physical pages in its address range are locked in memory and can be double-mapped into system space. These locked pages cannot be freed until the buffer object has been deleted. The Fast I/O environment uses this feature by locking the buffer object itself during $IO_SETUP. This prevents the buffer object and its associated pages from being deleted. The buffer object is unlocked during $IO_CLEANUP, or at image rundown. After creating a buffer object, the process remains fully pageable and swappable, and it retains normal virtual memory access to the pages in the buffer object.
If the buffer object contains process data structures to be passed to an OpenVMS system service, the OpenVMS system can use the buffer object to avoid any probing, lockdown, and unlocking overhead associated with these process data structures. Additionally, if the buffer object has performed double-mapping into system space, this allows the OpenVMS system direct access to the process memory from system context.
To date, only the Fast I/O services are supported with buffer objects. For example, a buffer object allows a programmer to eliminate I/O memory management overhead. On each I/O, each page of a user data buffer is probed and then locked down on I/O initiation and unlocked on I/O completion. Instead of incurring this overhead for each I/O, it can be done once at buffer object creation time. Subsequent I/O operations involving the buffer object can completely avoid this memory management overhead.
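The saving described above can be made concrete with a rough count of page operations. The sketch below is a simplified model, not a measurement: the 8192-byte page size, the function names, and the assumption that probe, lock, and unlock each count as one operation per page are all illustrative.

```python
PAGE_SIZE = 8192  # assumed page size in bytes, for illustration only


def pages_spanned(length, page_size=PAGE_SIZE):
    """Number of pages covered by a page-aligned buffer of `length` bytes."""
    return -(-length // page_size)       # ceiling division


def per_io_page_operations(buffer_length, n_ios):
    """Without a buffer object: probe, lock, and unlock every page on every I/O."""
    return n_ios * 3 * pages_spanned(buffer_length)


def buffer_object_page_operations(buffer_length, n_ios):
    """With a buffer object: probe and lock once at creation, unlock once at deletion."""
    return 3 * pages_spanned(buffer_length)  # independent of n_ios
```

For a 64 KB buffer and 1000 I/Os, the model gives 24000 page operations without a buffer object and 24 with one.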
The system space window buffer object allows several I/O related tasks to be performed entirely from system context at high IPL, without having to assume process context. When a buffer object is created, the system maps by default a section of system space (S2) to process pages associated with the buffer object. This protected system space window allows read and write access only from kernel mode. Because all of system space is equally accessible from within any context, it is now possible to avoid the context switch to assume the original user's process context. Optionally, the system space window can be in S0/S1 space, or it can be suppressed.
Two system services are used to create and delete buffer objects: SYS$CREATE_BUFOBJ_64 and SYS$DELETE_BUFOBJ. Both services can be called from any access mode. To create a buffer object, the SYS$CREATE_BUFOBJ_64 system service is called. This service expects as inputs an existing process memory range and returns a handle for the buffer object. The handle is an opaque identifier used to identify the buffer object on future requests. The SYS$DELETE_BUFOBJ system service is used to delete the buffer object and accepts as input the handle. Although image rundown deletes all existing buffer objects, it is good practice for the application to clean up properly.
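The handle life cycle can be sketched as a small registry. This is a toy model rather than the system service interface: the class, its method names, and the choice to raise an exception when a locked buffer object is deleted are assumptions made for illustration.

```python
import itertools


class BufferObjectError(Exception):
    """Raised by the model when a locked buffer object is deleted."""


class BufferObjectTable:
    """Toy registry of buffer objects keyed by opaque handles."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._objects = {}               # handle -> {"range": ..., "locks": int}

    def create(self, start, length):
        """Model of creation: takes an existing range, returns a handle."""
        handle = next(self._next_handle)
        self._objects[handle] = {"range": (start, length), "locks": 0}
        return handle

    def io_setup(self, handle):
        """Model of the setup step, which locks the buffer object."""
        self._objects[handle]["locks"] += 1

    def io_cleanup(self, handle):
        """Model of the cleanup step, which releases one lock."""
        self._objects[handle]["locks"] -= 1

    def delete(self, handle):
        """Deletion is refused while the object is still locked."""
        if self._objects[handle]["locks"] > 0:
            raise BufferObjectError("buffer object is locked by I/O setup")
        del self._objects[handle]

    def __len__(self):
        return len(self._objects)
```

A caller would create the object, run setup, perform its I/O, run cleanup, and only then delete the object using the handle returned at creation.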
Buffer objects require system management. Because buffer objects tie up physical memory, extensive use of buffer objects requires system management planning. All the bytes of memory in the buffer object are deducted from the systemwide SYSGEN parameter MAXBOBMEM (maximum buffer object memory). System managers must set this parameter correctly for the application loads that run on their systems. Two other SYSGEN parameters, MAXBOBS0S1 and MAXBOBS2, are also available to system managers, but they are now regarded as obsolete. Initially, the MAXBOBS0S1 and MAXBOBS2 parameters were intended to ensure that users could not adversely affect the system by creating huge buffer objects. But as users began to use buffer objects more widely, managing the combination of these parameters proved to be too complex.
Now, users who want to create buffer objects must either hold the VMS$BUFFER_OBJECT_USER identifier or execute in executive or kernel mode. Therefore, these users are considered privileged applications, and the additional safeguard that these parameters provided is unnecessary.
To determine current usage of system memory resources, enter the following command:
Table 23-5 shows these three parameters and their meanings.
The MAXBOBMEM, MAXBOBS0S1, and MAXBOBS2 parameters default to 100 Alpha pages, but for applications with large buffer pools they can be set much larger. To prevent user-mode code from tying up excessive physical memory, user-mode callers of $CREATE_BUFOBJ_64 must have a new system identifier, VMS$BUFFER_OBJECT_USER, assigned. The system manager can assign this identifier with the DCL command SET ACL to a protected subsystem or application that creates buffer objects from user mode. It may also be appropriate to grant the identifier to a particular user with the Authorize utility command GRANT/IDENTIFIER, for example, to a programmer who is working on a development system.
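The two checks described in this section (who may create a buffer object, and how much memory remains under MAXBOBMEM) can be sketched as follows. The class and its methods are hypothetical, and accounting in whole pages is a simplification chosen to match the default quoted above.

```python
class BufferObjectQuota:
    """Toy model of the systemwide buffer object memory limit."""

    def __init__(self, maxbobmem_pages=100):
        self.limit = maxbobmem_pages     # models the MAXBOBMEM setting
        self.in_use = 0

    def may_create(self, mode, identifiers):
        """Executive and kernel mode callers pass; others need the identifier."""
        if mode in ("executive", "kernel"):
            return True
        return "VMS$BUFFER_OBJECT_USER" in identifiers

    def reserve(self, pages, mode="user", identifiers=()):
        """Deduct pages from the limit if the caller is allowed and they fit."""
        if not self.may_create(mode, identifiers):
            return False
        if self.in_use + pages > self.limit:
            return False
        self.in_use += pages
        return True

    def release(self, pages):
        """Return pages to the pool when a buffer object is deleted."""
        self.in_use = max(0, self.in_use - pages)
```

With the default limit, a user-mode caller without the identifier is refused outright, and a request that would push usage past 100 pages is refused regardless of access mode.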
There are several buffer object restrictions which are listed as follows:
For complete information about using Fast I/O, the Fast I/O system services, and the buffer objects system services that are in the following list, see the HP OpenVMS I/O User's Reference Manual, and the HP OpenVMS System Services Reference Manual: A--GETUAI and the HP OpenVMS System Services Reference Manual: GETUTC--Z:
23.22.2 Fast Path (Alpha and I64 Only)
Like Fast I/O, Fast Path is an optional, high-performance feature designed to improve I/O performance. By restructuring and optimizing class and port device driver code around high-volume I/O code paths, Fast Path creates a streamlined path to the device. Fast Path is of interest to any application where enhanced I/O performance is desirable. Two examples are database systems and real-time applications, where the speed of transferring data to disk is often a vital concern.
Using Fast Path features does not require source-code changes. Minor interface changes are available for expert programmers who want to maximize Fast Path benefits.
At this time, Fast Path is not available on the OpenVMS VAX operating system.
Fast Path achieves performance gains by reducing CPU time for I/O requests on both uniprocessor and SMP systems. The performance benefits are produced by:
The performance improvement can best be seen by contrasting the current OpenVMS I/O scheme to the new Fast Path scheme. While transparent to an OpenVMS user, each disk and tape device is tied to a specific port interconnect. All I/O for a device is sent out over its assigned port. Under the current OpenVMS I/O scheme, a multiprocessor I/O can be initiated on any CPU, but I/O completion must occur on the primary CPU. Under Fast Path, all I/O for a given port is affinitized to a specific CPU, eliminating the requirement for completing the I/O on the primary CPU. This means that the entire I/O can be initiated and completed on a single CPU. Because I/O operations are no longer split among different CPUs, performance increases as memory cache thrashing between CPUs decreases.
Fast Path also removes a possible SMP bottleneck on the primary CPU. If the primary CPU must be involved in all I/O, then once this CPU becomes saturated, no further increase in I/O throughput is possible. Spreading the I/O load evenly among CPUs in a multiprocessor system provides greater maximum I/O throughput on a multiprocessor system.
With most of the I/O code path executing under port-specific spinlocks
and with each port assigned to a specific CPU, a scalable SMP model of
parallel operation exists. Given multiple ports and CPUs, I/O can be
issued in parallel to a large degree.
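The effect of affinitizing each port to a CPU can be illustrated with a small load model. Everything here is hypothetical: the round-robin assignment, the port names in the usage note, and the function names are assumptions for illustration and do not describe how OpenVMS actually chooses a CPU for a port.

```python
from collections import Counter


def completions_without_fast_path(requests, primary_cpu=0):
    """Older model: every I/O completes on the primary CPU."""
    return Counter({primary_cpu: len(requests)})


def assign_ports(ports, cpus):
    """Hypothetical round-robin assignment of each port to one CPU."""
    return {port: cpus[i % len(cpus)] for i, port in enumerate(ports)}


def completions_with_fast_path(requests, affinity):
    """Fast Path model: an I/O starts and completes on its port's CPU."""
    return Counter(affinity[port] for port in requests)
```

Given four ports named `PORT_A` through `PORT_D`, four CPUs, and 1000 requests spread evenly over the ports, the older model places all 1000 completions on CPU 0, while the Fast Path model places 250 on each CPU.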
For complete information about using Fast Path, see the HP OpenVMS I/O User's Reference Manual.
| Entry Point | System Service | Function |
|---|---|---|
| LIB$SYS_ASCTIM | $ASCTIM | Converts system time in binary form to ASCII text |
| LIB$SYS_FAO | $FAO | Converts a binary value to ASCII text |
| LIB$SYS_FAOL | $FAOL | Converts a binary value to ASCII text, using a list argument |
| LIB$SYS_GETMSG | $GETMSG | Obtains a system or user-defined message text |
| LIB$SYS_TRNLOG | $TRNLOG | Returns the translation of the specified logical name |
Two command language interpreters (CLIs) are available on the operating system: DCL and MCR. The run-time library provides several routines that provide access to the CLI callback facility. These routines allow your program to call the current CLI. In most cases, these routines are called from programs that execute as part of a command procedure. They allow the command procedure and the CLI to exchange information.
These routines call the CLI associated with the current process to perform the specified function. In some cases, however, a CLI is not present. For example, the program may be running directly as a subprocess or as a detached process. If a CLI is not present, these routines return the status LIB$_NOCLI. Therefore, make sure that these routines are called only when a CLI is active. Table 24-2 lists the RTL routines that access the CLI.
| LIB$GET_FOREIGN | Gets a command line |
| LIB$DO_COMMAND | Executes a command line after exiting the current program |
| LIB$RUN_PROGRAM | Runs another program after exiting the current program (chain) |
| LIB$GET_SYMBOL | Returns the value of a CLI symbol as a string |
| LIB$DELETE_SYMBOL | Deletes a CLI symbol |
| LIB$SET_SYMBOL | Defines or redefines a CLI symbol |
| LIB$DELETE_LOGICAL | Deletes a supervisor-mode process logical name |
| LIB$SET_LOGICAL | Defines or redefines a supervisor-mode process logical name |
| LIB$DISABLE_CTRL | Disables CLI interception of control characters |
| LIB$ENABLE_CTRL | Enables CLI interception of control characters |
| LIB$ATTACH | Attaches a terminal to another process |
| LIB$SPAWN | Creates a subprocess of the current process |
The following routines execute only when the current CLI is DCL:
The LIB$GET_FOREIGN routine returns the contents of the command line that you use to activate an image. You can use it either to give your program access to the qualifiers of a foreign command or to prompt for further command line text.
A foreign command is a command that you can define and then use, as if it were a DCL or MCR command, to run a program. When you use the foreign command at command level, the CLI parses only the foreign command itself and activates the image. It ignores any options or qualifiers that you have defined for the foreign command. Once the CLI has activated the image, the program can call LIB$GET_FOREIGN to obtain and parse the remainder of the command line (after the command itself) for whatever options it may contain.
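The division of labor described above, in which the CLI handles the command name and the program handles everything after it, can be sketched as a simple split. The function is a hypothetical model, and splitting only at the first space is a simplification of how a CLI delimits the command name.

```python
def split_foreign_command(line):
    """Return (verb, remainder); the CLI is modeled as parsing only the verb."""
    stripped = line.strip()
    verb, _, remainder = stripped.partition(" ")
    return verb, remainder.strip()
```

In this model, the second element of the result corresponds to the text that the program would obtain and parse for its own options.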
The HP OpenVMS DCL Dictionary describes how to define a foreign command.
The action of LIB$GET_FOREIGN depends on the environment in which the image is activated:
The following PL/I example illustrates the use of the optional force-prompt argument to permit repeated calls to LIB$GET_FOREIGN. The command line text is retrieved on the first pass only; after this, the program prompts from SYS$INPUT.
EXAMPLE: ROUTINE OPTIONS (MAIN);

    %INCLUDE $STSDEF;    /* Status-testing definitions */

    DECLARE COMMAND_LINE CHARACTER(80) VARYING,
            PROMPT_FLAG FIXED BINARY(31) INIT(0),
            LIB$GET_FOREIGN ENTRY (CHARACTER(*) VARYING,
                                   CHARACTER(*) VARYING,
                                   FIXED BINARY(15),
                                   FIXED BINARY(31))
                OPTIONS(VARIABLE) RETURNS (FIXED BINARY(31)),
            RMS$_EOF GLOBALREF FIXED BINARY(31) VALUE;

    /* Call LIB$GET_FOREIGN repeatedly to obtain and print
       subcommand text. Exit when end-of-file is found. */

    DO WHILE ('1'B);    /* Do while TRUE */
        STS$VALUE = LIB$GET_FOREIGN
            (COMMAND_LINE,'Input: ',, PROMPT_FLAG);
        IF STS$SUCCESS THEN
            PUT LIST (' Command was ',COMMAND_LINE);
        ELSE DO;
            IF STS$VALUE ^= RMS$_EOF THEN
                PUT LIST ('Error encountered');
            RETURN;
            END;
        PUT SKIP;    /* Skip to next line */
        END;    /* End of DO WHILE loop */
    END;
Assuming that this program is present as SYS$SYSTEM:EXAMPLE.EXE, you can define the foreign command EXAMPLE to invoke it, as follows:
$ EXAM*PLE :== $EXAMPLE
Note the optional use of the asterisk in the symbol name to denote an abbreviated command name. This permits the command name to be abbreviated as EXAM, EXAMP, EXAMPL or to be specified fully as EXAMPLE. See the HP OpenVMS DCL Dictionary for information about abbreviated command names.
Note that the use of the dollar sign ($) before the image name is required in foreign commands.
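The abbreviation rule for a symbol such as EXAM*PLE can be expressed as a short matching function. This is a sketch of the rule as described above, not of the CLI's implementation: the function name is hypothetical, and treating the comparison as case-insensitive is an assumption.

```python
def matches_abbreviated_symbol(symbol, typed):
    """True if `typed` selects a symbol defined with an optional asterisk."""
    symbol, typed = symbol.upper(), typed.upper()
    if "*" not in symbol:
        return typed == symbol           # no abbreviation allowed
    required, optional = symbol.split("*", 1)
    full = required + optional
    # The typed name must include everything before the asterisk and
    # may then continue with any leading part of what follows it.
    return len(typed) >= len(required) and full.startswith(typed)
```

With the symbol `EXAM*PLE`, the inputs EXAM, EXAMP, EXAMPL, and EXAMPLE all match, while EXA and EXAMPLES do not.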
Now assume that a user runs the image by typing the foreign command and giving "subcommands" that the program displays:
$ EXAMP Subcommand 1
 Command was SUBCOMMAND 1
Input: Subcommand 2
 Command was SUBCOMMAND 2
Input: ^Z
$
In this example, Subcommand 1 was obtained from the command line; the program prompts the user for the second subcommand. The program terminated when the user pressed the Ctrl/Z key sequence (displayed as ^Z) to indicate end-of-file.
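The control flow of the session can be mimicked with a small model of the force-prompt flag. This is an illustrative sketch, not the routine's specification: `get_foreign` and `run_example` are hypothetical names, the flag is modeled as bit 0 of an integer that is set on every return, and the uppercasing simply mirrors what the sample session displays.

```python
def get_foreign(command_line, input_lines, flags):
    """Toy model: returns (text, new_flags), or (None, new_flags) at end-of-file.

    Bit 0 of `flags` clear: hand back the command-line text.
    Bit 0 set: take the next line from the simulated SYS$INPUT instead.
    Bit 0 is always set in the returned flags, so later calls prompt.
    """
    if flags & 1 == 0:
        return command_line, flags | 1
    try:
        return next(input_lines), flags | 1
    except StopIteration:
        return None, flags | 1


def run_example(command_line, typed_lines):
    """Mirror the loop in the PL/I example and collect what it would print."""
    output, flags = [], 0
    source = iter(typed_lines)
    while True:
        text, flags = get_foreign(command_line, source, flags)
        if text is None:                 # end-of-file (Ctrl/Z in the session)
            return output
        output.append("Command was " + text.upper())
```

Calling `run_example("Subcommand 1", ["Subcommand 2"])` walks through the same three steps as the session: command-line text first, one prompted line second, then end-of-file.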