HP OpenVMS Systems Documentation
HP OpenVMS Cluster Systems
Table A-2 lists system parameters that should not require adjustment at any time. These parameters are provided for use in system debugging. HP recommends that you do not change these parameters unless you are advised to do so by your HP support representative. Incorrect adjustment of these parameters can result in cluster failures.
1. Print a listing of SYSUAF.DAT on each computer. To print this listing,
invoke AUTHORIZE and specify the AUTHORIZE command LIST as follows:
$ SET DEF SYS$SYSTEM
2. Use the listings to compare the accounts from each computer. On the
listings, mark any necessary changes. For example:
3. Choose the SYSUAF.DAT file from one of the computers to be a master SYSUAF.DAT.
4. Merge the SYSUAF.DAT files from the other computers to the master
SYSUAF.DAT by running the Convert utility (CONVERT) on the computer
that owns the master SYSUAF.DAT. (See the OpenVMS Record Management Utilities Reference Manual for a
description of CONVERT.) To use CONVERT to merge the files, each
SYSUAF.DAT file must be accessible to the computer that is running
CONVERT.
Note that if a given user name appears in more than one source file, only the first occurrence of that name appears in the merged file.
Example: The following command sequence
creates a new SYSUAF.DAT file from the combined contents of the two
input files:
The CONVERT command in this example adds the records from the files [SYS1.SYSEXE]SYSUAF.DAT and [SYS2.SYSEXE]SYSUAF.DAT to the file SYSUAF.DAT on the local computer.
After you run CONVERT, you have a master SYSUAF.DAT that contains records from the other SYSUAF.DAT files.
5. Use AUTHORIZE to modify the accounts in the master SYSUAF.DAT according to the changes you marked on the initial listings of the SYSUAF.DAT files from each computer.
6. Place the master SYSUAF.DAT file in SYS$COMMON:[SYSEXE].
7. Remove all node-specific SYSUAF.DAT files.
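The note in step 4 (only the first occurrence of a user name survives the merge) is easy to overlook. The following Python sketch models that rule on plain in-memory records so you can see which entries would need to be reconciled by hand in step 5. It is an illustration only and does not reproduce how CONVERT works internally; the function name `merge_first_wins` and the sample records are hypothetical.

```python
def merge_first_wins(*sources):
    """Merge (user name, record) lists; the first occurrence of a name wins.

    Returns the merged mapping plus the later duplicates that were dropped,
    so they can be reviewed manually.
    """
    merged = {}
    dropped = []
    for source in sources:
        for username, record in source:
            if username in merged:
                # A later source repeats a name that is already present.
                dropped.append((username, record))
            else:
                merged[username] = record
    return merged, dropped


# Hypothetical records standing in for two node-specific files.
node1 = [("SYSTEM", "node1 settings"), ("SMITH", "node1 settings")]
node2 = [("SMITH", "node2 settings"), ("JONES", "node2 settings")]

merged, dropped = merge_first_wins(node1, node2)
```

In this sketch, `dropped` holds the SMITH record from the second source. Those are the accounts to check against the changes you marked on the listings.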
If you need to merge RIGHTSLIST.DAT files, you can use a command sequence like the following:
$ ACTIVE_RIGHTSLIST = F$PARSE("RIGHTSLIST","SYS$SYSTEM:.DAT")
$ CONVERT/SHARE/STAT 'ACTIVE_RIGHTSLIST' RIGHTSLIST.NEW
$ CONVERT/MERGE/STAT/EXCEPTION=RIGHTSLIST_DUPLICATES.DAT -
_$ [SYS1.SYSEXE]RIGHTSLIST.DAT, [SYS2.SYSEXE]RIGHTSLIST.DAT RIGHTSLIST.NEW
$ DUMP/RECORD RIGHTSLIST_DUPLICATES.DAT
$ CONVERT/NOSORT/FAST/STAT RIGHTSLIST.NEW 'ACTIVE_RIGHTSLIST'
The commands in this example add the RIGHTSLIST.DAT files from two OpenVMS Cluster computers to the master RIGHTSLIST.DAT file in the current default directory. For detailed information about creating and maintaining RIGHTSLIST.DAT files, see the security guide for your system.
This appendix contains information to help you perform troubleshooting operations for the following:
If, after performing preliminary checks and taking appropriate
corrective action, you find that a computer still fails to boot or to
join the cluster, you can follow the procedures in Sections
C.2 through C.3 to attempt recovery.
C.1.2 Sequence of Booting Events
To perform diagnostic and recovery procedures effectively, you must understand the events that occur when a computer boots and attempts to join the cluster. This section outlines those events and shows typical messages displayed at the console.
Note that events vary, depending on whether a computer is the first to boot in a new cluster or whether it is booting in an active cluster. Note also that some events (such as loading the cluster database containing the password and group number) occur only in OpenVMS Cluster systems on a LAN or IP.
The normal sequence of events is shown in Table C-1.
The computer boots. If the computer is a satellite, a message like the
following shows the name and LAN address of the MOP server that has
downline loaded the satellite. At this point, the satellite has
completed communication with the MOP server and further communication
continues with the system disk server, using OpenVMS Cluster
communications.
%VAXcluster-I-SYSLOAD, system loaded from Node X...
For any booting computer, the OpenVMS "banner message" is displayed in the following format:
operating-system Version n.n dd-mmm-yyyy hh:mm.ss
The computer attempts to form or join the cluster, and the following
message appears:
waiting to form or join an OpenVMS Cluster system
If the computer is a member of an OpenVMS Cluster based on the LAN,
the cluster security database (containing the cluster password and
group number) is loaded. Optionally, the MSCP server and TMSCP server
can be loaded:
If the computer is a member of an OpenVMS Cluster based on IP, the
IP configuration file is also loaded along with the cluster security
database, the MSCP server and the TMSCP server:
For IP-based cluster communication, the IP interface and TCP/IP
services are enabled. The multicast and unicast addresses are added to
the list for the IP bus (WE0), and the Hello packet is sent:
If the computer discovers a cluster, the computer attempts to join it.
If a cluster is found, the connection manager displays one or more
messages in the following format:
%CNXMAN, Sending VAXcluster membership request to system X...
Otherwise, the connection manager forms the cluster when it has enough votes to establish quorum (that is, when enough voting computers have booted).
As the booting computer joins the cluster, the connection manager
displays a message in the following format:
%CNXMAN, now a VAXcluster member -- system X...
Note that if quorum is lost while the computer is booting, or if a
computer is unable to join the cluster within 2 minutes of booting, the
connection manager displays messages like the following:
The last two messages show any connections that have already been formed.
If the cluster includes a quorum disk, you may also see messages like
the following:
%CNXMAN, Using remote access method for quorum disk
The first message indicates that the connection manager is unable to access the quorum disk directly, either because the disk is unavailable or because it is accessed through the MSCP server. Another computer in the cluster that can access the disk directly must verify that a reliable connection to the disk exists.
The second message indicates that the connection manager can access the quorum disk directly and can supply information about the status of the disk to computers that cannot access the disk directly.
Note: The connection manager may not see the quorum disk initially because the disk may not yet be configured. In that case, the connection manager first uses remote access, then switches to local access.
Once the computer has joined the cluster, normal startup procedures
execute. One of the first functions is to start the OPCOM process:
%%%%%%%%%%% OPCOM 15-JAN-1994 16:33:55.33 %%%%%%%%%%%
As other computers join the cluster, OPCOM displays messages like the
following:
%%%%% OPCOM 15-JAN-1994 16:34:25.23 %%%%% (from node X...)
As startup procedures continue, various messages report startup events.
Hint: For troubleshooting purposes, you can add messages to your site-specific startup procedures that announce each phase of the startup process (for example, mounting disks or starting queues).
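When a computer stalls during boot, it helps to know the last stage of the sequence above that it reached. The following Python sketch scans captured console output for the message fragments quoted in this section and reports the furthest stage seen. It is a minimal illustration, assuming that you have saved the console output as a list of text lines; the stage descriptions and the function name `last_stage_reached` are hypothetical, and the message text on a real system may differ in detail.

```python
# Message fragments quoted in this section, in the order the events occur.
# The stage descriptions on the right are illustrative labels only.
STAGES = [
    ("%VAXcluster-I-SYSLOAD", "satellite downline load complete"),
    ("waiting to form or join", "attempting to form or join the cluster"),
    ("Sending VAXcluster membership request", "cluster found, requesting membership"),
    ("now a VAXcluster member", "joined the cluster"),
    ("OPCOM", "normal startup procedures running"),
]


def last_stage_reached(console_lines):
    """Return the description of the furthest boot stage seen, or None."""
    highest = -1
    for line in console_lines:
        for index, (marker, _description) in enumerate(STAGES):
            if marker in line:
                highest = max(highest, index)
    return STAGES[highest][1] if highest >= 0 else None


# Sample capture built from the message formats shown in this section.
log = [
    "%VAXcluster-I-SYSLOAD, system loaded from Node X...",
    "waiting to form or join an OpenVMS Cluster system",
    "%CNXMAN, Sending VAXcluster membership request to system X...",
]
stage = last_stage_reached(log)
```

For the sample capture, the computer found a cluster and asked to join but never reported membership, which points to the join step as the place to start troubleshooting.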
C.2 Satellite Fails to Boot
To boot successfully, a satellite must communicate with a MOP server
over the LAN or IP. You can use the DECnet event logging feature to
verify this communication. Perform the following procedure:
1. Log in as system manager on the MOP server.
2. If event logging for management-layer events is not already enabled,
enter the following NCP commands to enable it:
NCP> SET LOGGING MONITOR EVENT 0.*
3. Enter the following DCL command to enable the terminal to receive
DECnet messages reporting downline load events:
4. Boot the satellite. If the satellite and the MOP server can communicate
and all boot parameters are correctly set, messages like the following
are displayed at the MOP server's terminal:
DECnet event 0.3, automatic line service
Sections C.2.2 through C.2.5 provide more information
about satellite boot troubleshooting and often recommend that you
ensure that the system parameters are set correctly.
C.2.1 Displaying Connection Messages
To enable the display of connection messages during a conversational boot, perform the following steps:
1. Enable conversational booting by setting the satellite's NISCS_CONV_BOOT system parameter to 1. On Alpha systems, update the ALPHAVMSSYS.PAR file, and on Integrity server systems, update the IA64VMSSYS.PAR file in the system root on the disk server.
2. Perform a conversational boot.
On Integrity servers and Alpha systems, enter the following command
at the console:
On VAX systems, set bit <0> in register R5. For example, on a
VAXstation 3100 system, enter the following command on the console:
3. Observe connection messages.
Display connection messages during a satellite boot to determine which system in a large cluster is serving the system disk to a cluster satellite during the boot process. If booting problems occur, you can use this display to help isolate the problem with the system that is currently serving the system disk. Then, if your server system has multiple LAN adapters, you can isolate specific LAN adapters.
4. Isolate LAN adapters.
Isolate a LAN adapter by methodically rebooting with only one adapter connected. That is, disconnect all but one of the LAN adapters on the server system and reboot the satellite. If the satellite boots when it is connected to the system disk server, then follow the same procedure using a different LAN adapter. Continue these steps until you have located the bad adapter.
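The isolation procedure above is a one-at-a-time elimination loop. The following Python sketch expresses that logic so the stopping condition is explicit. It is an illustration only: the callback `satellite_boots_with` stands in for the manual step of leaving a single adapter connected and rebooting the satellite, and the adapter names are hypothetical.

```python
def find_bad_adapter(adapters, satellite_boots_with):
    """Return the first adapter with which the satellite fails to boot.

    Returns None if the satellite boots with every adapter tested.
    """
    for adapter in adapters:
        # Manual step in practice: disconnect every other LAN adapter on the
        # server, reboot the satellite, and record whether the boot succeeds.
        if not satellite_boots_with(adapter):
            return adapter
    return None


# Hypothetical results recorded during the manual tests.
observed = {"EWA0": True, "EWB0": False, "EWC0": True}
bad = find_bad_adapter(list(observed), observed.get)
```

Here the loop stops at the second adapter, the first one for which the recorded boot result is a failure.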
Reference: See also Appendix C for help with troubleshooting satellite booting problems.