The Question is:
My application runs in a VMS Cluster consisting of 2 nodes. The app has its
own "failover" processes which use a middleware product for the interprocess
communication between the 2 Failover Managers (one on each node). The
middleware product uses the network.
When a network problem occurs, both Failover Managers assume that their
partner node is down, when in fact the node is ok, it's the network that's down.
Is there a means of interprocess communication in a VMS cluster that does not
use the network?
The Answer is:
If you have a two-node OpenVMS Cluster, you must have a quorum disk,
or you must have a primary-secondary configuration, or you must add
a third node. Please see the OpenVMS FAQ for details of the VOTES
and EXPECTED_VOTES system parameter settings.
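The three options above can be compared with a little vote arithmetic. The sketch below is only an illustration: it assumes the usual quorum calculation of (EXPECTED_VOTES + 2) / 2, rounded down, and the function names and the vote assignments shown are hypothetical examples rather than values taken from your configuration.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES, assuming (EXPECTED_VOTES + 2) // 2."""
    return (expected_votes + 2) // 2


def survives(votes_still_present: int, expected_votes: int) -> bool:
    """True if the surviving members hold enough votes to keep running."""
    return votes_still_present >= quorum(expected_votes)


# Hypothetical vote assignments for the cases named in the answer.
# Each entry: (description, EXPECTED_VOTES, votes left after one node is lost)
scenarios = [
    ("two nodes, 1 vote each, no quorum disk", 2, 1),
    ("two nodes plus quorum disk, 1 vote each", 3, 2),
    ("primary (1 vote) survives, secondary has 0 votes", 1, 1),
    ("secondary (0 votes) survives, primary is lost", 1, 0),
    ("three nodes, 1 vote each", 3, 2),
]

for description, expected, remaining in scenarios:
    state = "continues" if survives(remaining, expected) else "pauses pending quorum"
    print(f"{description}: quorum={quorum(expected)}, survivor {state}")
```

With two equally voting nodes and nothing else, the survivor holds one vote against a quorum of two, which is why one of the three configurations is required.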
You do not indicate the particular cluster configuration -- there
are configurations that use the same Ethernet or FDDI network that
you seem to be having problems with, and there are configurations
that use other interconnects.
Interestingly, the fact that your two failover managers continued to
operate indicates this was a network protocol or routing failure, or that
there is an additional interconnect between these two cluster nodes.
(Had the cluster been partitioned, one or both of the nodes would
have paused pending quorum.)
That said, you will want to examine the use and operation of the
distributed lock manager (DLM), as the DLM is the component that
is specifically targeted at this class of problem. Using the DLM,
an application can be designed to permit exactly one and only one
process to hold a particular resource name, and the holder of this
lock resource will be assumed to be the primary. For details on
uses of the DLM please see the OpenVMS Programming Concepts Manual.
Please note: the DLM is the mechanism that supports synchronization
of the file system. The DLM is designed to be capable of synchronizing
completely arbitrary activities among any cooperating applications.
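The primary-election pattern described above can be modelled in a few lines. This is a toy, in-memory stand-in written only to show the logic; it is not the DLM interface (on OpenVMS the real calls are the $ENQ and $DEQ system services), and the class name, method names, resource name and process names are all hypothetical.

```python
class ToyLockManager:
    """In-memory model of exclusive locks on named resources (not the real DLM)."""

    def __init__(self):
        self._holder = {}   # resource name -> process currently holding the lock
        self._waiters = {}  # resource name -> processes queued for the lock, in order

    def request_exclusive(self, resource, process):
        """Grant the lock if it is free; otherwise queue the requester."""
        if resource not in self._holder:
            self._holder[resource] = process
            return True
        self._waiters.setdefault(resource, []).append(process)
        return False

    def release(self, resource, process):
        """Release the lock (also models the holder exiting); return the new holder."""
        waiters = self._waiters.get(resource, [])
        if self._holder.get(resource) != process:
            # Not the holder: just drop any queued request from this process.
            if process in waiters:
                waiters.remove(process)
            return self._holder.get(resource)
        if waiters:
            # The oldest waiter is granted the lock and becomes the primary.
            self._holder[resource] = waiters.pop(0)
        else:
            del self._holder[resource]
        return self._holder.get(resource)

    def primary(self, resource):
        """The holder of the lock is treated as the primary."""
        return self._holder.get(resource)


if __name__ == "__main__":
    dlm = ToyLockManager()
    dlm.request_exclusive("MYAPP_PRIMARY", "FAILOVER_MGR_A")  # granted
    dlm.request_exclusive("MYAPP_PRIMARY", "FAILOVER_MGR_B")  # queued as standby
    print("primary:", dlm.primary("MYAPP_PRIMARY"))
    dlm.release("MYAPP_PRIMARY", "FAILOVER_MGR_A")            # primary goes away
    print("primary:", dlm.primary("MYAPP_PRIMARY"))
```

The point of the design is that neither failover manager has to guess whether its partner is alive: whichever process holds the lock is the primary, and the standby simply waits for its queued request to be granted.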
OpenVMS V7.2 and later provide IntraCluster Communications Services
(ICC), which allow a user process to communicate directly via the
Systems Communications Services (SCS) protocols used within an OpenVMS
Cluster.
With OpenVMS V5.0 and later, the peer node status can be determined
using $GETSYI and the state of the peer process(es) can be determined
using $GETJPI. With more recent OpenVMS versions (V7.*), you can
request notifications of arriving and departing cluster members using
the cluster event services ($SETCLUEVT, etc).