HP OpenVMS Systems Documentation
OpenVMS Version 7.3 Release Notes
5.9.17 Multipath Failover Fails Infrequently on HSZ70/HSZ80 Controllers
Under heavy load, a host-initiated manual or automatic path switch from one controller to another may fail on an HSZ70 or HSZ80 controller. Testing has shown this to occur infrequently.
5.9.18 SCSI Multipath Incompatibility with Some Third-Party Products
OpenVMS Alpha Version 7.2 introduced the SCSI multipath feature, which provides support for failover between the multiple paths that can exist between a system and a SCSI device.
This SCSI multipath feature may be incompatible with some third-party disk caching, disk shadowing, or similar products. Compaq advises you to avoid the use of such software on SCSI devices that are configured for multipath failover (for example, SCSI devices that are connected to HSZ70 and HSZ80 controllers in multibus mode) until this feature is supported by the manufacturer of the software.
Third-party products that rely on altering the Driver Dispatch Table (DDT) of the OpenVMS Alpha SCSI disk class driver (SYS$DKDRIVER.EXE) may require changes to work correctly with the SCSI multipath feature. Manufacturers of such software can contact Compaq at firstname.lastname@example.org for more information.
For more information about OpenVMS Alpha SCSI multipath features, see Guidelines for OpenVMS Cluster Configurations.
Attempts to add a Gigabit Ethernet node to an OpenVMS Cluster system over a Gigabit Ethernet switch will fail if the switch does not support autonegotiation. The DEGPA enables autonegotiation by default, but not all Gigabit Ethernet switches support autonegotiation. For example, the current Gigabit Ethernet switch made by Cabletron does not.
Furthermore, the messages that are displayed may be misleading. If the node is being added using CLUSTER_CONFIG.COM and the option to install a local page and swap disk is selected, the problem may look like a disk-serving problem. The node running CLUSTER_CONFIG.COM displays the message "waiting for node-name to boot," while the booting node displays "waiting to tune system." The list of available disks is never displayed because of a missing network path. The network path is missing because of the autonegotiation mismatch between the DEGPA and the switch.
To avoid this problem, disable autonegotiation on the new node's DEGPA, as follows:
5.9.20 DQDRIVER Namespace Collision Workaround
Multiple systems in a cluster could each have IDE, ATA, or ATAPI devices potentially sharing the following names: DQA0, DQA1, DQB0, and DQB1.
Such sharing of device names could lead to confusion or errors. Starting with OpenVMS Version 7.2-1, you can avoid this problem by creating devices with unique names.
To create a list of uniquely named devices on your cluster, use the following procedure:
Following is a sample SYS$SYSTEM:SYS$DEVICES.DAT file (for node ACORN::):
This procedure causes all DQ devices to be named according to the following format, which allows for unique device names across the cluster:
node-name is the system name.
Port allocation classes are described in the OpenVMS Cluster Systems manual, where this technique is fully documented.
You have the option of using a nonzero port allocation class in the SYS$DEVICES.DAT file. However, if you use nonzero port allocation classes, be sure to follow the rules outlined in the OpenVMS Cluster Systems manual.
If you attempt to use the DCL command $INITIALIZE to initialize an IDE hard drive on a remote system using the mass storage control protocol (MSCP) server, you may receive a warning message about the lack of a bad block file on the volume. You can safely ignore this warning message.
Additionally, previously unused drives from certain vendors contain
factory-written data that mimics the data pattern used on a head
alignment test disk. In this case, the OpenVMS software will not
initialize this disk remotely. As a workaround, initialize the disk
from its local system. Note that this workaround also avoids the bad
block file warning message.
Multiple-switch Fibre Channel fabrics are supported by OpenVMS, starting with the DEC-AXPVMS-VMS721_FIBRECHAN-V0200 remedial kit. However, a significant restriction existed in the use of Volume Shadowing for OpenVMS in configurations with a multiple-switch fabric. All Fibre Channel hosts that mounted the shadow set had to be connected to the same switch, or all the Fibre Channel shadow set members had to be connected to the same switch. If the Fibre Channel host or shadow set member was connected to multiple fabrics, then this rule had to be followed for each fabric.
Changes have been made to Volume Shadowing for OpenVMS in OpenVMS
Version 7.3 that remove these configuration restrictions. These same
changes are available for OpenVMS Versions 7.2-1 and 7.2-1H1 in the
current version-specific Volume Shadowing remedial kits.
This problem is corrected in OpenVMS Alpha Version 7.2-1H1 and OpenVMS
Alpha Version 7.3 with a new system parameter, NPAGECALC, which
automatically calculates a value for NPAGEVIR and NPAGEDYN based on the
amount of physical memory in the system.
The problem of Fibre Channel adapters being off line after a system boot has been corrected in the following versions:
5.9.24 SHOW DEVICE Might Fail in Large Fibre Channel Configurations
The problem of SHOW DEVICE failing in large Fibre Channel configurations has been corrected in the following versions:
Before the correction, SHOW DEVICE might fail with a virtual address space full error on systems with more than 2400 unit control blocks (UCBs). In multipath SCSI and FC configurations, there is a UCB for each path from the host to every storage device.
Note that any procedure that calls SHOW DEVICE, such as CLUSTER_CONFIG,
can also experience this problem.
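As a rough illustration of how quickly a multipath configuration can reach the 2400-UCB figure quoted above, the following Python sketch counts one UCB per host-to-device path. The helper name and the sample configuration are hypothetical, and a real system also has UCBs for devices that are not counted here.

```python
# Threshold quoted in the note above: failures were seen above this count.
UCB_LIMIT = 2400

def estimate_storage_ucbs(paths_per_device):
    """Hypothetical helper: one UCB per path from the host to each device."""
    return sum(paths_per_device)

# Hypothetical configuration: 700 storage devices, each reachable over 4 paths.
config = [4] * 700
total = estimate_storage_ucbs(config)
print(total, total > UCB_LIMIT)  # 2800 True
```

In this assumed configuration, 700 devices would stay under the limit with a single path each but exceed it with four paths each.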
The problem of boot failure with the KGPSA loopback connector has been corrected in the following versions:
Before the correction, the system failed to boot if there was a KGPSA in the system with a loopback connector installed. The loopback connector is the black plastic protective cover over the GLMs/fiber-optic ports of the KGPSA.
If possible, install OpenVMS Alpha Version 7.2-1 and the current FIBRE_SCSI update kit for OpenVMS Alpha Version 7.2-1 before installing the KGPSA in your system.
If the KGPSA is installed on your system and the current FIBRE_SCSI update kit for OpenVMS Alpha Version 7.2-1 is not installed, you can connect the KGPSA to your Fibre Channel storage subsystem and then boot OpenVMS.
If you are not ready to connect the KGPSA to your Fibre Channel storage subsystem, you can do either of the following:
If you attempt to boot OpenVMS when a KGPSA is installed with the
loopback connector still attached, the system hangs early in the boot
process, at the point when it should configure the Fibre Channel
Enclosing a Fibre Channel path name in quotation marks is valid, starting in OpenVMS Alpha Version 7.3.
Prior to OpenVMS Version 7.3, the documentation and help text indicated that a path name could be enclosed in quotation marks, for example:
In versions of the system prior to OpenVMS Version 7.3, this command fails with the following error:
To prevent this problem on systems running prior versions of OpenVMS, omit the quotation marks that surround the path identifier string, as follows:
5.9.27 Reconfigured Fibre Channel Disks Do Not Come On Line
The following problem is corrected in OpenVMS Alpha Version 7.2-1H1 and OpenVMS Alpha Version 7.3.
Each Fibre Channel device has two identifiers on the HSG80. The first is the logical unit number (LUN). This value is established when you use the command ADD UNIT Dnnn on the HSG80, where nnn is the LUN value. The LUN value is used by the HSG80 to deliver packets to the correct destination. The LUN value must be unique on the HSG80 subsystem. The second identifier is the device identifier. This value is established when you use the following command, where nnnnn is the device identifier:
The device identifier is used in the OpenVMS device name and must be
unique in your cluster.
In OpenVMS Alpha Version 7.2-1H1 and OpenVMS Alpha Version 7.3,
assigning a device identifier to the HSG CCL is optional. If you do not
assign one, OpenVMS will not configure the $1$GGA device but will
configure the other devices on the HSG subsystem.
The problem described in this note is corrected in OpenVMS Alpha Version 7.3 by giving preference to the current path, thereby avoiding path switching after a transient error.
Every I/O error that invokes mount verification causes the multipath
failover code to search for a working path. In earlier versions of
OpenVMS, the multipath algorithm started with the primary path (that
is, the first path configured by OpenVMS) and performed a search,
giving preference to any direct paths to an HSx controller that has the
device on line. Before the correction, the algorithm did not test the
current path first, and did not stay on that path if the error
condition had cleared.
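The difference between the two behaviors can be sketched in Python as follows. This is a simplified model of the description above, not the OpenVMS implementation; the Path class, the function names, and the path names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Path:
    name: str                    # hypothetical path name
    working: bool                # path currently passes I/O
    direct_online: bool = False  # direct path to a controller with the device on line

def select_path_earlier(paths):
    """Earlier behavior: search from the primary path (paths[0]),
    preferring a direct path to a controller that has the device on line."""
    candidates = [p for p in paths if p.working]
    for p in candidates:
        if p.direct_online:
            return p
    return candidates[0] if candidates else None

def select_path_v73(paths, current):
    """V7.3 behavior as described: test the current path first and stay on it
    if the error condition has cleared; otherwise fall back to the search."""
    if current.working:
        return current
    return select_path_earlier(paths)

# A transient error occurred on the second path and has already cleared.
paths = [Path("PATH_A", True, True), Path("PATH_B", True, True)]
current = paths[1]
print(select_path_earlier(paths).name)       # PATH_A (an unnecessary switch)
print(select_path_v73(paths, current).name)  # PATH_B (stays on current path)
```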
The release notes in this section pertain to the OpenVMS Registry.
Removing the data transfer size restrictions on the OpenVMS NT Registry required a change in the communication protocol used by the Registry. The change means that components of the Registry (the $REGISTRY system service and the Registry server) in OpenVMS V7.3 are incompatible with their counterparts in OpenVMS V7.2-1.
If you plan to run a cluster with mixed versions of OpenVMS, and you plan to use the $REGISTRY service or a product that uses the $REGISTRY service (such as Advanced Server or COM for OpenVMS), then you are restricted to running these services on the OpenVMS V7.3 nodes only, or on the V7.2-1 nodes only, but not both.
If you need to run Registry services on both V7.3 and V7.2-1 nodes in
the same cluster, please contact your Compaq Services representative.
The backup and restore of the OpenVMS NT Registry database is discussed in the OpenVMS Connectivity Developer Guide. Compaq stresses the importance of periodic backups. Database corruptions are rare, but have been exposed during testing of previous versions of OpenVMS with databases larger than 2 megabytes. A database of this size is itself rare; the initial database size is 8 kilobytes. In addition, the corruptions occurred only when an entire cluster was rebooted.
The Registry server provides a way of backing up the database automatically. By default, the Registry server takes a snapshot of the database once per day. However, this operation is basically a file copy and, by default, it purges the copies to the most recent five. It is conceivable that a particular area of the database may become corrupted and Registry operations will continue as long as applications do not access that portion of the database. This means that the daily backup could in fact be making a copy of an already corrupt file.
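The following Python sketch models why a daily copy with a five-copy purge limit may not protect against an undetected corruption. It is a toy simulation only; the function name and the day numbers are hypothetical, and only the retention count of five comes from the text above.

```python
from collections import deque

KEEP = 5  # default number of snapshot copies retained, per the note above

def retained_snapshots(total_days, corrupt_from_day):
    """Simulate one snapshot per day; each snapshot is a plain copy, so it
    inherits any corruption already present in the database."""
    snapshots = deque(maxlen=KEEP)  # older copies are purged automatically
    for day in range(1, total_days + 1):
        snapshots.append((day, day >= corrupt_from_day))
    return list(snapshots)

# Corruption appears on day 4 and goes unnoticed until day 10.
kept = retained_snapshots(total_days=10, corrupt_from_day=4)
print([day for day, _ in kept])          # [6, 7, 8, 9, 10]
print(all(corrupt for _, corrupt in kept))  # True
```

Under these assumptions, every retained copy is already corrupt by the time the problem is noticed.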
To safeguard against this, Compaq recommends that you take these additional steps:
Note also that in previous versions of OpenVMS, the EXPORT command may have failed to complete under some conditions. You could normally recover by reinvoking the REG$CP image and retrying the operation until it succeeded.
In addition, in previous versions of OpenVMS, the IMPORT command failed
to properly import keys with classnames or links. The only way to
recover from this was to modify the keys to add in the classnames or
links, or to recreate the keys in question.
The OpenVMS virtual I/O cache (VIOC) and the extended file cache (XFC) are file-oriented disk caches that can help to reduce I/O bottlenecks and improve performance. (Note that the XFC appears on Alpha systems beginning with Version 7.3.) Cache operation is transparent to application software. Frequently used file data can be cached in memory so that the file data can be used to satisfy I/O requests directly from memory rather than from disk.
Prior to Version 7.0, when an I/O was avoided because the data was returned from the cache, the direct I/O (DIO) count for the process was not incremented because the process did not actually perform an I/O operation to a device. Starting with Version 7.0, all I/O requests are counted as direct I/Os, including those I/Os that were actually avoided because the data was returned from the cache.
This change can be a potential cause for confusion when you are
comparing application performance data on different versions of
OpenVMS. Applications running on Version 7.0 and later may appear to be
performing more I/O than they did when run on earlier versions, even
though the actual amount of I/O to the disk remains the same.
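The effect on the reported count can be sketched in Python as follows. The function and the sample request list are hypothetical; the sketch only models the two counting rules described above.

```python
def dio_count(requests, count_cache_hits):
    """requests: iterable of booleans, True when the request was satisfied
    from the cache. Returns the DIO count under the chosen counting rule."""
    return sum(1 for cache_hit in requests if count_cache_hits or not cache_hit)

# Hypothetical workload: five I/O requests, three satisfied from the cache.
requests = [True, True, False, True, False]

before_v70 = dio_count(requests, count_cache_hits=False)
from_v70 = dio_count(requests, count_cache_hits=True)
print(before_v70, from_v70)  # 2 5
```

In both cases only two requests reach the disk; only the reported count differs.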
The Point-to-Point utility (PPPD) initiates and manages a Point-to-Point Protocol (PPP) network connection and its link parameters from an OpenVMS Alpha host system.
A chapter in the OpenVMS System Management Utilities Reference Manual: M--Z describes the PPPD commands with their parameters and qualifiers, which support PPP connections.
In certain instances, the queue journal file (SYS$QUEUE_MANAGER.QMAN$JOURNAL) may grow very large (over 500,000 blocks), especially when the volume of queue activity is very high. This may cause either a long boot time or the display of an error message, QMAN-E-NODISKSPACE, in OPERATOR.LOG. The long boot time is caused by the queue manager needing a large space to accommodate the queue journal file.
The following example shows the error messages displayed in OPERATOR.LOG:
You can shrink the size of the journal file by having a privileged user issue the following DCL command:
Executing this DCL command checkpoints the queue journal file and shrinks it to the minimum size required for queue system operation.
Until this problem is fixed, use this workaround to keep the journal file small.
The following release notes pertain to RMS Journaling for OpenVMS.
Prior to Version 7.2, recovery unit (RU) journals were created temporarily in the [SYSJNL] directory on the same volume as the file that was being journaled. The file name for the recovery unit journal had the form RMS$process_id (where process_id is the hexadecimal representation of the process ID) and a file type of RMS$JOURNAL.
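The pre-Version 7.2 naming rule can be sketched in Python as follows. The function name is hypothetical, and the eight-digit, zero-padded, uppercase hexadecimal format is an assumption based on the sample file name that appears later in this note.

```python
def ru_journal_name(process_id: int) -> str:
    """Build the pre-V7.2 recovery unit journal file specification.
    Assumes the process ID is rendered as 8 uppercase hexadecimal digits."""
    return f"[SYSJNL]RMS${process_id:08X}.RMS$JOURNAL"

print(ru_journal_name(0x214003BC))  # [SYSJNL]RMS$214003BC.RMS$JOURNAL
```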
The following changes have been introduced to RU journal file creation in OpenVMS Version 7.2:
These changes reduce the directory overhead associated with journal file creation and deletion.
The following example shows both the previous and current versions of journal file creation:
Previous versions: [SYSJNL]RMS$214003BC.RMS$JOURNAL;1
If RMS does not find either the [SYSJNL] directory or the node-specific
directory, RMS creates them automatically.
Because DECdtm Services is not supported in a multiple kernel threads
environment and RMS recovery unit journaling relies on DECdtm Services,
RMS recovery unit journaling is not supported in a process with
multiple kernel threads enabled.
You can use after-image (AI) journaling to recover a data file that becomes unusable or inaccessible. AI recovery uses the AI journal file to roll forward a backup copy of the data file to produce a new copy of the data file at the point of failure.
In the case of either a process deletion or system crash, an update can be written to the AI journal file, but not make it to the data file. If only AI journaling is in use, the data file and journal are not automatically made consistent. If additional updates are made to the data file and are recorded in the AI journal, a subsequent roll forward operation could produce an inconsistent data file.
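The following Python sketch is a toy model of this scenario, not a description of how RMS stores journals. The data file is modeled as a dictionary, the AI journal as a list of (key, value) entries, and all names and values are hypothetical.

```python
def roll_forward(backup, journal):
    """Apply every journaled update, in order, to a copy of the backup."""
    data = dict(backup)
    for key, value in journal:
        data[key] = value
    return data

backup = {"A": 1}
data_file = dict(backup)
journal = []

# An update reaches the AI journal, but the process is deleted
# before the update reaches the data file.
journal.append(("A", 2))

# A later update, computed from the stale data file, is journaled and applied.
new_b = data_file["A"] + 1
journal.append(("B", new_b))
data_file["B"] = new_b

recovered = roll_forward(backup, journal)
print(data_file)               # {'A': 1, 'B': 2}
print(recovered)               # {'A': 2, 'B': 2}
print(recovered == data_file)  # False
```

In this model, the rolled-forward copy contains an update that the live data file never received, so the later update was computed from a value that the recovered file does not contain.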
If you use Recovery Unit (RU) journaling with AI journaling, the automatic transaction recovery restores consistency between the AI journal and the data file.
Under some circumstances, an application that uses only AI journaling can take proactive measures to guard against data inconsistencies after process deletions or system crashes. For example, a manual roll forward of AI-journaled files ensures consistency after a system crash involving either an unshared AI application (single accessor) or a shared AI application executing on a standalone system.
However, in a shared AI application, there may be nothing to prevent further operations from being executed against a data file that is out of synchronization with the AI journal file after a process deletion or system crash in a cluster. Under these circumstances, consistency among the data files and the AI journal file can be provided by using a combination of AI and RU journaling.