The Question is:
A two-node AlphaServer 4100 cluster with StorageWorks, HSZ50 controllers,
100 Mbps NICs (full-duplex) and a quorum disk fails when the network
switch is reset: one of the nodes takes a memory dump and reboots, gets as
far as the "have connection to quorum disk" message, and hangs. It will
remain at this spot until the second node is rebooted and its SRM prompt
(>>>) appears; only then can the first node boot completely. What is the
likely cause of this kind of problem?
The Answer is :
If resetting the network switch causes a loss of connection to the
remote system, then there may be a problem with the OpenVMS SCS
communications or with the switch itself. The workaround, rather
obviously, involves avoiding the network switch reset which is the
apparent trigger of the loss of communications.
If the network itself is unstable, you will want to configure a local
network for the cluster, or add a CI, Memory Channel (MC), or other
cluster communications interconnect to the configuration.
In this case, it appears that one or both of the hosts lack a connection
to the quorum disk, or the cluster itself has lost quorum. Details of
the cluster hardware and software configuration are lacking, so a more
specific answer is difficult at best.
Please see the OpenVMS FAQ for details on correctly setting the
system parameters VOTES and EXPECTED_VOTES.
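
As a rough illustration of why these settings matter, the sketch below applies the OpenVMS quorum formula, (EXPECTED_VOTES + 2) / 2 rounded down, to an assumed configuration in which each node has VOTES=1 and the quorum disk contributes one vote (QDSKVOTES=1). The vote values and the helper names are assumptions for illustration only; the actual settings of this cluster are not known, and a running cluster may adjust its quorum value dynamically.

```python
def quorum(expected_votes: int) -> int:
    """Quorum formula: (EXPECTED_VOTES + 2) / 2, with the fraction dropped."""
    return (expected_votes + 2) // 2


def has_quorum(visible_votes: int, expected_votes: int) -> bool:
    """A node keeps running only while the votes it can see reach quorum."""
    return visible_votes >= quorum(expected_votes)


# Assumed values, not taken from the question.
NODE_VOTES = 1        # VOTES on each AlphaServer 4100
QDSK_VOTES = 1        # QDSKVOTES contributed by the quorum disk
EXPECTED_VOTES = 2 * NODE_VOTES + QDSK_VOTES  # both nodes plus the disk

# Votes visible to one node in a few hypothetical situations.
scenarios = {
    "both nodes and quorum disk": 2 * NODE_VOTES + QDSK_VOTES,
    "one node and quorum disk": NODE_VOTES + QDSK_VOTES,
    "one node, no quorum disk": NODE_VOTES,
}

print("quorum =", quorum(EXPECTED_VOTES))
for name, votes in scenarios.items():
    state = "continues" if has_quorum(votes, EXPECTED_VOTES) else "hangs"
    print(f"{name}: {votes} vote(s), {state}")
```

Under these assumed values, a node that can see neither its partner nor the quorum disk holds only one of the two votes required, so it waits rather than proceeding on its own.
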
Please apply the available OpenVMS ECO kits for OpenVMS V7.2-1.
With two AlphaServer 4100 series systems with SCSI and an Ethernet
connection, the quorum disk would often be configured as a disk on
a multi-host SCSI bus. If the quorum disk is not accessible on a
shared storage or communications interconnect, then the use of the
quorum disk can and likely should be entirely eliminated.
Failing resolution via ECO or network stabilization, please contact
the Compaq Customer Support Center directly.