HP OpenVMS Systems Documentation
HP OpenVMS Version 8.2 New Features and Documentation Overview
6.4.2 Merge Resulting from Mount Verification Timeout
A shadow set that enters mount verification and either times out or aborts mount verification will enter a merge state if the following conditions are true:
The system on which the mount verification timed out (or aborted mount verification) notifies the other systems on which the shadow set is mounted that a merge operation is needed, and then it will disable the shadow set. (It does not dismount it.)
For example, if a shadow set is mounted on eight systems and mount
verification times out on two of them, those two systems check their
internal queues for write I/O. If any write I/O is found, the shadow
set will need to be merged.
The SET SHADOW/DEMAND_MERGE command initiates a merge of a specified shadow set or of all shadow sets. This qualifier is useful if the shadow set was created with the INITIALIZE/SHADOW command without the use of the /ERASE qualifier.
The SET SHADOW command was introduced in OpenVMS Alpha Version 7.3-2.
For more information about using SET SHADOW/DEMAND_MERGE, refer to
HP OpenVMS DCL Dictionary and to the HP Volume Shadowing for OpenVMS manual.
In a full merge operation, the members of a shadow set are compared with each other to ensure that they contain the same data. The comparison is performed block by block across the entire volume and can be a very lengthy procedure.
A minimerge operation can be significantly faster. By using information about write operations that were logged in volatile controller storage or in a write bitmap on an OpenVMS system, volume shadowing merges only those areas of the shadow set where write activity occurred. This avoids the need for the entire volume scan that is required by full merge operations, thus reducing consumption of system I/O resources.
Prior to the introduction of HBMM, minimerge was controller-based and
available only on the HSJ, HSC, and HSD controllers.
HBMM depends on bitmaps and policies to provide the information required for minimerge operations. Depending on your computing environment, one HBMM policy, a DEFAULT policy that you specify, might be sufficient.
Before you can use HBMM for recovery of a shadow set, the following conditions must be true:
When a policy is associated with a shadow set and the shadow set is mounted on several systems, bitmaps specific to that shadow set are created.
The systems selected from the master list, as specified in the HBMM
policy definition, can perform a minimerge operation because they
possess the master bitmaps. All other systems on which the shadow set
is mounted possess a local bitmap for each master bitmap.
For a given bitmap, there is exactly one master version on some system in the cluster and a local version on every other system that has the associated shadow set mounted. A minimerge operation can occur only on a system with a master bitmap. A shadow set can have up to six HBMM master bitmaps. Multiple master bitmaps for the same shadow set are equivalent, but each has a different bitmap ID.
The following example shows two master bitmaps for DSA12, one on RAIN and one on SNOW, each with a unique bitmap ID:
If only one master bitmap exists for the shadow set, and the system with the master bitmap fails or is shut down, the bitmap is gone; that is, the remaining local versions are automatically deleted. Local bitmaps cannot be used for recovery.
If multiple master bitmaps were created for the shadow set and at least one remains, that master bitmap can be used for recovery. HP recommends the use of multiple master bitmaps, especially for multiple-site cluster systems. Multiple master bitmaps increase the likelihood of an HBMM operation rather than a full merge in the event of a system failure.
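The effect of losing systems that host master bitmaps can be sketched in a few lines of Python. This is only an illustration of the rule described above, not part of the volume shadowing software: the function name `recovery_after_failure` is hypothetical, and the sketch ignores every other condition that influences which merge is performed.

```python
def recovery_after_failure(master_hosts, failed_systems):
    """Return the kind of merge expected after the given systems fail.

    master_hosts:   systems that hold a master bitmap for the shadow set
    failed_systems: systems that failed or were shut down
    """
    # Local bitmaps cannot be used for recovery, so only surviving
    # master bitmaps count.
    surviving_masters = set(master_hosts) - set(failed_systems)
    return "minimerge" if surviving_masters else "full merge"


# DSA12 with master bitmaps on RAIN and SNOW (names taken from the text above).
print(recovery_after_failure({"RAIN", "SNOW"}, {"RAIN"}))  # minimerge
print(recovery_after_failure({"RAIN"}, {"RAIN"}))          # full merge
```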
Bitmaps require additional memory. The calculation is based on the
shadow set volume size. For every gigabyte of storage of a shadow set
mounted on a system, 2 KB of bitmap memory is required on that system
for each bitmap. For example, a shadow set with a volume size of 200 GB
of storage and 2 bitmaps uses 800 KB of memory on every system on which
it is mounted.
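The memory calculation can be written as a small Python helper for planning purposes. The function name `bitmap_memory_kb` is hypothetical; the 2 KB per gigabyte per bitmap figure is the one stated above.

```python
KB_PER_GB_PER_BITMAP = 2  # from the text: 2 KB of memory per GB per bitmap


def bitmap_memory_kb(volume_size_gb, bitmap_count):
    """Approximate bitmap memory, in KB, on each system that mounts the shadow set."""
    return volume_size_gb * KB_PER_GB_PER_BITMAP * bitmap_count


# The example from the text: a 200 GB shadow set with 2 bitmaps.
print(bitmap_memory_kb(200, 2))  # 800
```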
A policy specifies the following attributes for one or more shadow sets:
You can assign almost any name to a policy. However, the reserved names DEFAULT and NODEFAULT have specific properties that are described in Section 6.7. You can also create a policy without a name and assign it to a specific shadow set. An advantage of a named policy is that it can be reused by specifying only its name.
Multiple policies can be created to customize the minimerge operations in a cluster.
You use the SET SHADOW/POLICY command with HBMM-specific qualifiers to
define, assign, deassign, and delete policies and to enable and disable
HBMM on a shadow set. SET SHADOW/POLICY is the only user interface for
specifying HBMM policies. You cannot use the MOUNT command to define a
policy. You can define a policy before the shadow set is mounted.
(Policies can be associated with shadow sets in other ways as well, as
described in Section 6.7.)
An HBMM policy specification consists of a list of HBMM policy keywords enclosed with parentheses. The HBMM policy keywords are MASTER_LIST, COUNT, and RESET_THRESHOLD. Of the three keywords, only MASTER_LIST must be specified. If COUNT and RESET_THRESHOLD are omitted, default values are supplied. (For examples of policy specifications, see Section 6.9.1 and HP OpenVMS DCL Dictionary.)
The use of these keywords and the rules for specifying them are described in this section.
The MASTER_LIST keyword is used to identify a set of systems as candidates for a master bitmap. The system-list value can be a single system name; a parenthesized, comma-separated list of system names; or the asterisk (*) wildcard character. For example:
When the system list consists of a single system name or the wildcard character, parentheses are optional.
An HBMM policy must include at least one MASTER_LIST. Multiple master lists are optional. If a policy has multiple master lists, the entire policy must be enclosed with parentheses, and each constituent master list must be separated by a comma as shown in the following example:
There is no significance to the position of a system name in a master list.
The COUNT keyword specifies how many of the systems named in the master list can have a master bitmap. Therefore, the COUNT keyword must be associated with a specific master list by enclosing both in parentheses.
A COUNT value of n means that you want master bitmaps on any n systems in the associated master list. It does not necessarily mean that the first n systems in the list are chosen.
The COUNT keyword is optional. When omitted, the default value is the number of systems in the master list or the value of 6, whichever is less. You cannot specify more than one COUNT keyword for any one master list.
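The default COUNT rule can be illustrated with a short Python sketch. It assumes a master list given as explicit system names (it does not model the asterisk wildcard), and the names `default_count` and `NODE1` through `NODE8` are hypothetical.

```python
MAX_MASTER_BITMAPS = 6  # maximum number of master bitmaps per shadow set


def default_count(master_list):
    """COUNT value that applies when the keyword is omitted from a master list."""
    return min(len(master_list), MAX_MASTER_BITMAPS)


print(default_count(["RAIN", "SNOW"]))                    # 2
print(default_count([f"NODE{i}" for i in range(1, 9)]))   # 6
```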
The following two examples are valid policies:
In contrast, the following example is not valid because the COUNT keyword is not grouped with a specific master list:
The RESET_THRESHOLD keyword specifies the number of blocks that can be set before the bitmap is eligible to be cleared. Each bit that is set in a master bitmap corresponds to a set of blocks that needs to be merged. Therefore, the merge time can be influenced by this value.
Bitmaps are eligible to be cleared when the RESET_THRESHOLD is exceeded. However, the reset is not guaranteed to occur immediately when the threshold is crossed. For additional information about choosing a value for this attribute, see Section 6.8.2 and Section 6.10.2.
A single reset threshold value is associated with any given HBMM policy. Therefore, the RESET_THRESHOLD keyword cannot be specified more than once in a given policy specification. Because its scope is the entire policy, the RESET_THRESHOLD keyword cannot be specified inside a constituent master list when the policy uses multiple master lists.
When the RESET_THRESHOLD keyword is omitted, the value of 50000 is used by default.
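The scoping rules for the three keywords can be summarized as a small data model. The following Python sketch is an illustration only, not a parser for the SET SHADOW/POLICY syntax: the class names `MasterList` and `HbmmPolicy` and the system names `NODE3` and `NODE4` are hypothetical. COUNT is an attribute of one master list, while RESET_THRESHOLD is a single attribute of the whole policy.

```python
from dataclasses import dataclass
from typing import List, Optional

DEFAULT_RESET_THRESHOLD = 50000  # default stated in the text


@dataclass
class MasterList:
    systems: List[str]
    count: Optional[int] = None  # at most one COUNT per master list


@dataclass
class HbmmPolicy:
    master_lists: List[MasterList]
    reset_threshold: int = DEFAULT_RESET_THRESHOLD  # one value per policy

    def __post_init__(self):
        # An HBMM policy must include at least one MASTER_LIST.
        if not self.master_lists:
            raise ValueError("a policy needs at least one MASTER_LIST")


policy = HbmmPolicy([
    MasterList(["RAIN", "SNOW"], count=1),
    MasterList(["NODE3", "NODE4"]),
])
print(policy.reset_threshold)  # 50000
```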
The following policy example includes an explicit reset threshold value:
6.7 Rules Governing HBMM Policies
The following rules govern the creation and management of HBMM policies. The rules are based on the assumption that a shadow set is mounted on a system that supports HBMM.
The named policies DEFAULT and NODEFAULT have special properties, as summarized in the following sections:
6.8 Guidelines for Establishing HBMM Policies
Establishing HBMM policies is likely to be an ongoing process as configurations change and as you learn more about how HBMM works and how it affects various operations on your systems. This section describes a number of considerations to help you determine what policies are appropriate for your configuration.
The settings depend on your hardware and software configuration, the
computing load, and your operational requirements. These guidelines
should assist you with choosing the initial settings for your
configuration. As you observe the results in your configuration, you
can make further adjustments to suit your computing environment.
There are several factors to consider when choosing the number of master bitmaps to specify in a policy and the systems that will host the master bitmaps. The first issue is how many master bitmaps should be used in the configuration. Six is the maximum per shadow set. The use of each additional master bitmap has a slight impact on write performance and also consumes memory on each system (as described in Section 6.5.1).
Using only one master bitmap creates a single point of failure; if the system hosting the master bitmap fails, the shadow set undergoes a full merge. Therefore, the memory consumption must be weighed against the adverse effects of a full merge. Using six master bitmaps provides the greatest defense against performing full merges.
Another issue when selecting a system to host the master bitmap is the I/O bandwidth of the various systems. Keep in mind that minimerges are always performed on a system that has a master bitmap. Therefore, low-bandwidth systems, such as satellite cluster members, are not good candidates.
The disaster tolerance of the configuration is also important in the
decision process. Specifying systems to host master bitmaps at multiple
sites helps ensure that a minimerge is performed if connectivity to an
entire site is lost. A two-site configuration should ensure that half
the master bitmaps are at each site, and a three-site configuration
should ensure that one third of the master bitmaps are at each of the three sites.
HBMM bitmaps keep track of writes to a shadow set. The more bits that are set in the bitmap, the greater the amount of merging that is required in the event of a minimerge. HBMM clears the bitmap (after ensuring that all outstanding writes have completed so that the members are consistent) when certain conditions are met (see Section 6.10.2). With a freshly cleared bitmap, in which few bits are set, a minimerge completes much more quickly.
The bitmap reset, however, can be costly to I/O performance. Before a bitmap reset can occur, all write I/O to the shadow set must be paused and any write I/O that is in flight must be completed. Then the bitmap is cleared. This is done on all systems on a per shadow set basis. Therefore, avoid a reset threshold setting that causes frequent resets.
You can view the number of resets performed by using the SHOW SHADOW command, as shown in the following example:
Writes that need to set bits in the bitmap are slightly slower than writes to areas that are already marked as having been written. Therefore, if many of the writes to a particular shadow set are concentrated in certain "hot" files, then the reset threshold should be made large enough so that the same bits are not constantly set and then cleared.
On the other hand, if the reset threshold is too large, then the advantages of HBMM are reduced. For example, if 50% of the bitmap is populated (that is, 50% of the shadow set has been written to since the last reset), then the HBMM merge will take approximately 50% of the time of a full merge.
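The proportional relationship in this example can be expressed as a rough estimate. The Python sketch below assumes, as the example does, that minimerge time scales linearly with the fraction of the bitmap that is populated; the function name `estimated_minimerge_time` is hypothetical, and the result is an approximation rather than a measured value.

```python
def estimated_minimerge_time(full_merge_time, bits_set, total_bits):
    """Rough minimerge time, in the same unit as full_merge_time."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    # Fraction of the bitmap that is populated since the last reset.
    populated_fraction = bits_set / total_bits
    return full_merge_time * populated_fraction


# A bitmap that is 50% populated, with a full merge that takes 120 minutes.
print(estimated_minimerge_time(120, 500, 1000))  # 60.0
```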
When selecting a reset threshold value, you need to balance the effects of bitmap resets on I/O performance against the time it takes to perform HBMM minimerges. The goal is to set the reset threshold as low as possible (thus decreasing merge times) without affecting application I/O performance. Too low a value degrades I/O performance. Too high a value causes HBMM merges to take extra time.
6.8.3 Using Multiple Policies
HBMM policies are defined to implement the decisions regarding master bitmaps. Some sites might find that a single policy can effectively implement the decisions. Other sites might need greater granularity and therefore implement multiple policies.
The most likely need for multiple policies is when the cluster includes enough high-bandwidth systems that you want to ensure that the merge load is spread out. Remember, minimerges occur only on systems that host a master bitmap. So, if 12 systems with high bandwidth are set up to perform minimerge or merge operations (the system parameter SHADOW_MAX_COPY is greater than zero on all systems), then you should ensure that the master bitmaps are spread out among these high-bandwidth systems.
Multiple HBMM policies are also useful when shadow sets need different
bitmap reset thresholds. The master list can be the same for each
policy, but the threshold can differ.
This section describes the major tasks for configuring and managing HBMM.
The SET SHADOW/POLICY=HBMM command is used to define HBMM policies. You can define multiple policies for your environment. The following examples show how to define two policies, a DEFAULT policy and POLICY_1, a named policy.
To define the policy named DEFAULT:
In this example, a DEFAULT policy is created for the cluster. The use of the asterisk wildcard (*) means that any system can host a master bitmap. The omission of the keyword COUNT=n means that up to six systems (the default and the current maximum supported) can host a master bitmap. The DEFAULT policy is inherited at mount time by shadow sets that have not been assigned a named policy.
The following example defines a named policy (POLICY_1), specifies the systems that are eligible to host a master bitmap, limits to two the number of systems that can host a master bitmap, and specifies a higher threshold (default is 50,000 blocks) to be reached before clearing the bitmap.