The Question is:
This question is about queue management. We have an Alpha cluster (two ES40s
with one DS10 as the quorum node) running OpenVMS V7.2-1. The queue manager
master file QMAN$MASTER.DAT is located on a cluster-common disk that both
ES40s can access. Everything worked fine until yesterday, when we shut down
one of the ES40s for maintenance. When we tried to reboot that system, it hung
just after joining the cluster, and nobody could log in to it. We tried
commenting START/QUEUE/MANAGER out of the system startup file to get into the
system, but the system started the queue manager automatically and hung again
after the reboot. Before the system hung, we had a chance to execute the DCL
command SHOW QUEUE/MANAGER.
The system console showed the queue manager as "starting". I understand that
this is not a normal state for the queue manager (it should be either running
or stopped). An error message also said something like "cannot access queue
database file QMAN$MASTER.DAT."
To work around the problem, we redefined the system logical name QMAN$MASTER
to point to another location instead of the common location, and created a new
queue database file just for that server. We then restarted the system; it
booted without a problem, but I know the queue is no longer shared.
Can you explain to me how and why this happened?
The Answer is:
The OpenVMS Wizard can answer your question (only) if you can provide
details on specifically what changed during the "maintenance", and in
the interval since the last (successful) reboot. Without this level
of detail, there can unfortunately be no particular answer.
Items such as incorrectly changing the local cluster host name or failing
to mount the disk where the shared files reside can all cause problems
with the system and with the queue startup. For details on changing the
nodename, please see the FAQ.
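As a starting point for gathering that detail, the following DCL commands
could be run on both the working node and the affected node and the output
compared. This is only a sketch of one possible check and not a diagnosis; it
assumes the QMAN$MASTER logical name is defined on the node where the commands
are entered.

```dcl
$! Show how (and in which table and mode) QMAN$MASTER is defined on this node
$ SHOW LOGICAL/FULL QMAN$MASTER
$! List the mounted disks, to confirm the cluster-common disk is present
$ SHOW DEVICE/MOUNTED
$! Confirm that the master file is reachable through the logical name
$ DIRECTORY QMAN$MASTER:QMAN$MASTER.DAT
$! Display the queue manager state and the location of its database
$ SHOW QUEUE/MANAGERS/FULL
```

A difference between the two nodes in the translation of QMAN$MASTER, or a
common disk that is missing from the mounted list on one node, would be worth
including in any follow-up question.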
You will want to have the correct set of logical name definitions for
cluster operations, and you will want to have these logical names
defined in SYLOGICALS.COM.
A list of the logical name definitions required for operation of a shared
cluster is included in SYLOGICALS.TEMPLATE (in V7.2 and later). Please
see the OpenVMS Cluster and System Manager's documentation, as this
documentation explains what the various individual logical names mean.
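For illustration only, a SYLOGICALS.COM fragment along these lines might look
like the sketch below. The device name $1$DKA100:, the volume label COMMON,
the logical name COMMON$DISK, and the directory [QUEUES] are all hypothetical
placeholders and are not taken from the question; substitute the values for
the local configuration, and consult SYLOGICALS.TEMPLATE for the full set of
logical names.

```dcl
$! Hypothetical fragment for SYS$MANAGER:SYLOGICALS.COM (placeholder names).
$! Mount the cluster-common disk early, before the queue manager needs it.
$ MOUNT/SYSTEM/NOASSIST $1$DKA100: COMMON COMMON$DISK
$! Point QMAN$MASTER at the directory that holds QMAN$MASTER.DAT.
$ DEFINE/SYSTEM/EXECUTIVE_MODE QMAN$MASTER COMMON$DISK:[QUEUES]
```

The same definitions would need to be present on every node that shares the
queue database, so that each node resolves QMAN$MASTER to the same file.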
You will also want to acquire and apply the mandatory ECO kits for
OpenVMS, though it is far from clear that these ECOs are related to
or will resolve the cause of this specific queue manager problem.