OpenVMS recommends that code that will run on the 21264
processor be checked for these sequences. Particular
attention should be paid to any code that does interprocess
locking, multithreading, or interprocessor communication.
The SRM_CHECK tool (named after the System Reference
Manual, which defines the Alpha architecture) has been
developed to analyze Alpha executables for noncompliant
code sequences. The tool will detect sequences that may
fail, report any errors, and display the machine code of
the failing sequence.
Using the Code Analysis Tool
The SRM_CHECK tool is located at:
http://h71000.www7.hp.com/openvms/srm_check.exe
To run the SRM_CHECK tool, define it as a foreign command
(or use the DCL$PATH mechanism) and invoke it with the name
of the image to check. If a problem is found, the tool
displays the machine code and prints some image
information. The following example illustrates how to use
the tool to analyze an image called myimage.exe:
$ define DCL$PATH []
$ srm_check myimage.exe
The tool supports wildcard searches. Use the following
command line to initiate a wildcard search:
$ srm_check [*...]* -log
Use the -log qualifier to generate a list of the images
that have been checked. Use the -output qualifier to write
the output to a data file, as in the following example,
which writes to a file named check.dat:
$ srm_check 'file' -output check.dat
The output from the tool can be used to find the module
that generated the sequence by looking in the image's
MAP file. The addresses shown correspond directly to the
addresses that can be found in the MAP file.
The following example illustrates the output from using the
analysis tool on an image named system_synchronization.exe:
** Potential Alpha Architecture Violation(s) found in file...
** Found an unexpected ldq at 00003618
0000360C  AD970130  ldq_l  R12, 0x130(R23)
00003610  4596000A  and    R12, R22, R10
00003614  F5400006  bne    R10, 00003630
00003618  A54B0000  ldq    R10, (R11)
Image Name:    SYSTEM_SYNCHRONIZATION
Image Ident:   X-3
Link Time:     5-NOV-1998 22:55:58.10
Build Ident:   X6P7-SSB-0000
Header Size:   584
Image Section: 0, vbn: 3, va: 0x0, flags: RESIDENT EXE (0x880)
The MAP file for system_synchronization.exe contains the
following:
EXEC$NONPAGED_CODE  00000000  0000B317  0000B318 ( 45848.)  2 ** 5
    SMPROUT         00000000  000047BB  000047BC ( 18364.)  2 ** 5
    SMPINITIAL      000047C0  000061E7  00001A28 ( 6696.)   2 ** 5
The address 360C is in the SMPROUT module (which contains
the addresses from 0-47BB). By looking at the machine code
output from the module, you can locate the code and use the
listing line number to identify the corresponding source
code. If SMPROUT had a nonzero base, it would be necessary
to subtract the base from the address (360C in this case)
to find the relative address in the listing file.
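The lookup described above can be sketched in a few lines of Python. This is only an illustration: the module ranges are copied by hand from the MAP excerpt, and the names MODULES and locate are hypothetical helpers, not part of SRM_CHECK or the linker.

```python
# Module ranges (name, base, end) transcribed by hand from the MAP excerpt.
# MODULES and locate() are hypothetical names used only for this sketch.
MODULES = [
    ("SMPROUT", 0x00000000, 0x000047BB),
    ("SMPINITIAL", 0x000047C0, 0x000061E7),
]


def locate(address, modules=MODULES):
    """Return (module name, address relative to the module base), or None."""
    for name, base, end in modules:
        if base <= address <= end:
            # Subtract the base to get the address used in the listing file.
            return name, address - base
    return None


name, relative = locate(0x360C)
print(name, format(relative, "X"))
```

For a module with a nonzero base, such as SMPINITIAL, the second element of the result is the address to look for in that module's listing file.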
Note that the tool reports potential violations in its
output. Although SRM_CHECK can normally identify a code
section in an image by the section's attributes, it is
possible for OpenVMS images to contain data sections with
those same attributes. As a result, SRM_CHECK may scan data
as if it were code, and occasionally a block of data may
look like a noncompliant code sequence. This has been
found to be quite rare. Such a false report can be detected
in the same way that noncompliant source code is found, by
examining the MAP and listing files.
The areas of noncompliance detected by the SRM_CHECK tool
can be grouped into the following four categories. Most of
these can be fixed by recompiling with new compilers. In
rare cases, the source code may need to be modified. See
Section 5 for information about compiler versions.
- Some versions of OpenVMS compilers introduce
noncompliant code sequences during an optimization
called "loop rotation." This problem can only be
triggered in C or C++ programs which use LDx_L/STx_C
instructions in assembly language code that is embedded
in the C/C++ source using the ASM function, or in
assembly language written in MACRO-32 or MACRO-64. In
some cases, a branch was introduced between the LDx_L
and STx_C instructions.
This can be addressed by recompiling.
- Some code compiled with very old Bliss and MACRO-32
compilers may contain noncompliant sequences. Early
versions of these compilers contained a code scheduling
bug where a load was incorrectly scheduled after a
load_locked.
This can be addressed by recompiling.
- The MACRO-32 compiler may generate a noncompliant code
sequence for a BBSSI or BBCCI instruction in rare cases
where there are too few free registers.
This can be addressed by recompiling.
- Incorrectly coded MACRO-64 or MACRO-32 and incorrectly
coded assembly language embedded in C or C++ source
using the ASM function.
This requires source code changes. The new MACRO-32
compiler will flag noncompliant code at compile time.
If the SRM_CHECK tool finds a violation in an image, the
image should be recompiled with the appropriate compiler
(see Section 5). After recompiling, the image should be
analyzed again. If violations remain after recompiling,
source code must be examined to determine why the code
scheduling violation exists. Modifications should then be
made to the source code.
The Alpha Architecture Reference Manual describes how an
atomic update of data between processors must be formed.
The Third Edition, in particular, has expanded greatly on
this topic. In this edition, Section 5.5, "Data Sharing",
and Section 4.2.4, which describes the LDx_L instructions,
detail the conventions of the interlocked memory sequence.
The following two requirements are the source of all known
noncompliant code:
- There cannot be a memory operation (load or store)
between the LDx_L (load locked) and STx_C (store
conditional) instructions in an interlocked sequence.
- There cannot be a branch taken between a LDx_L and a
STx_C instruction. Rather, execution must "fall through"
from the LDx_L to the STx_C without taking a branch.
Any branch whose target is between a LDx_L and matching
STx_C creates a noncompliant sequence. For example,
any branch to "label" in the following would result
in noncompliant code, regardless of whether the
branch instruction itself was within or outside of the
sequence:
LDx_L Rx, n(Ry)
...
label: ...
STx_C Rx, n(Ry)
Therefore, the SRM_CHECK tool looks for the following:
- Any memory operation (LDx/STx) between a LDx_L and a
STx_C.
- Any branch which has a destination between a LDx_L and
STx_C.
- STx_C instructions that do not have a preceding LDx_L
instruction.
This typically indicates that a backward branch is taken
from a LDx_L to the STx_C. Note that hardware device
drivers that do device mailbox writes are an exception,
and use the STx_C to write the mailbox. This is only
found on early Alpha systems, and not on PCI based
systems.
- Excessive instructions between a LDx_L and STx_C.
The AARM recommends that no more than 40 instructions
appear between a LDx_L and STx_C. In theory, more
than 40 instructions can cause hardware interrupts to
keep the sequence from completing. There are no known
occurrences of this.
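As a rough illustration of the four checks just listed, the following Python sketch scans an already-disassembled instruction list. It is not how SRM_CHECK itself is implemented. The input format (tuples of address, mnemonic, and branch target), the function name check, the message strings, and the small set of memory mnemonics are all assumptions made for this example.

```python
# Simplified sketch of the checks listed above; not SRM_CHECK's own logic.
# The tuple input format and all names here are assumptions for illustration.
LOAD_LOCKED = {"ldl_l", "ldq_l"}
STORE_CONDITIONAL = {"stl_c", "stq_c"}
MEMORY_OPS = {"ldl", "ldq", "stl", "stq"}  # small subset, for illustration
MAX_INSTRUCTIONS = 40                      # limit recommended by the AARM


def check(instructions):
    """Scan (address, mnemonic, branch_target_or_None) tuples in address order."""
    findings = []
    sequences = []     # (LDx_L address, STx_C address) of completed sequences
    lock_index = None  # index of the most recent unmatched LDx_L
    for i, (address, op, _target) in enumerate(instructions):
        if op in LOAD_LOCKED:
            lock_index = i
        elif op in STORE_CONDITIONAL:
            if lock_index is None:
                findings.append((address, "STx_C without a preceding LDx_L"))
            else:
                if i - lock_index - 1 > MAX_INSTRUCTIONS:
                    findings.append((address, "more than 40 instructions in sequence"))
                sequences.append((instructions[lock_index][0], address))
                lock_index = None
        elif op in MEMORY_OPS and lock_index is not None:
            findings.append((address, "unexpected %s after LDx_L" % op))
    # Second pass: a branch target after the LDx_L, up to and including the
    # STx_C, enters the sequence without executing the LDx_L.
    for address, op, target in instructions:
        if target is not None:
            for start, end in sequences:
                if start < target <= end:
                    findings.append((address, "branch into LDx_L/STx_C sequence"))
    return findings


listing = [
    (0x360C, "ldq_l", None),
    (0x3610, "and", None),
    (0x3614, "bne", 0x3630),
    (0x3618, "ldq", None),
]
for address, message in check(listing):
    print("%08X  %s" % (address, message))
```

The sample input is the sequence from system_synchronization.exe shown earlier, and the sketch reports the ldq at 00003618. A linear scan like this one does not follow control flow, so it is considerably less thorough than the real tool.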
To illustrate, the following are examples of code flagged
by SRM_CHECK.
** Found an unexpected ldq at 0008291C
00082914  AC300000  ldq_l  R1, (R16)
00082918  2284FFEC  lda    R20, 0xFFEC(R4)
0008291C  A6A20038  ldq    R21, 0x38(R2)
In the above example, a LDQ instruction was found after a
LDQ_L before the matching STQ_C. The LDQ must be moved out
of the sequence, either by recompiling or by source code
changes. (See Section 3.)
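When reading such a listing, it can help to decode the 32-bit instruction words by hand. The Python sketch below assumes the Alpha memory instruction format (opcode in bits 31-26, Ra in bits 25-21, Rb in bits 20-16, and a signed 16-bit displacement in bits 15-0). The function name decode_memory and the partial opcode table are illustrative only and are not part of SRM_CHECK.

```python
# Partial opcode table for Alpha memory-format instructions (illustrative).
OPCODES = {
    0x08: "lda",    # address computation only; does not access memory
    0x28: "ldl", 0x29: "ldq", 0x2A: "ldl_l", 0x2B: "ldq_l",
    0x2C: "stl", 0x2D: "stq", 0x2E: "stl_c", 0x2F: "stq_c",
}


def decode_memory(word):
    """Split a memory-format instruction word into (mnemonic, ra, rb, disp)."""
    opcode = (word >> 26) & 0x3F
    ra = (word >> 21) & 0x1F
    rb = (word >> 16) & 0x1F
    disp = word & 0xFFFF
    if disp & 0x8000:          # sign-extend the 16-bit displacement
        disp -= 0x10000
    return OPCODES.get(opcode), ra, rb, disp


for word in (0xAC300000, 0x2284FFEC, 0xA6A20038):
    print("%08X" % word, decode_memory(word))
```

Decoded this way, the middle word is an lda, which only computes an address, while the last word is a genuine ldq memory access. That access is the instruction the tool reports.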
** Backward branch from 000405B0 to a STx_C sequence at 0004059C
00040598  C3E00003  br     R31, 000405A8
0004059C  47F20400  bis    R31, R18, R0
000405A0  B8100000  stl_c  R0, (R16)
000405A4  F4000003  bne    R0, 000405B4
000405A8  A8300000  ldl_l  R1, (R16)
000405AC  40310DA0  cmple  R1, R17, R0
000405B0  F41FFFFA  bne    R0, 0004059C
In the above example, a branch was discovered between the
LDL_L and STL_C. In this case, there is no "fall through"
path between the LDx_L and STx_C, which the architecture
requires.
Note: This branch backward from the LDx_L to the STx_C is
characteristic of the noncompliant code introduced by
the "loop rotation" optimization.
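The branch destinations printed in the listing can be recomputed from the instruction words. The Python sketch below assumes the Alpha branch instruction format, in which the low 21 bits hold a signed longword displacement relative to the updated PC (the branch address plus 4). The name branch_target is a hypothetical helper for this illustration.

```python
def branch_target(address, word):
    """Compute the destination of an Alpha branch-format instruction."""
    disp = word & 0x1FFFFF
    if disp & 0x100000:        # sign-extend the 21-bit displacement
        disp -= 0x200000
    # The displacement counts longwords from the instruction after the branch.
    return address + 4 + 4 * disp


# The final bne in the listing above branches backward, past the ldl_l.
print("%08X" % branch_target(0x000405B0, 0xF41FFFFA))
```

Applied to the final bne at 000405B0, the displacement is negative, and the result is 0004059C, the address that the tool's message names as the start of the STx_C sequence.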
The following MACRO-32 source code has a fall through path
but is still noncompliant, because it contains both a
potential branch and a memory reference within the lock
sequence:
getlck:   evax_ldql  r0, lockdata(r8)   ; get the lock data
          movl       index, r2          ; and the current index
          tstl       r0                 ; If the lock is zero,
          beql       is_clear           ; skip ahead to store
is_clear: incl       r0                 ; increment lock count
          evax_stqc  r0, lockdata(r8)   ; and store it
          tstl       r0                 ; did store succeed?
          beql       getlck             ; retry if not
To correct this code, the memory access to read the value
of INDEX must first be moved outside the LDQ_L/STQ_C
sequence. Next, the branch between the LDQ_L and STQ_C,
to the label IS_CLEAR, must be eliminated. In this case,
it could be done using a CMOVEQ instruction. The CMOVxx
instructions are frequently useful for eliminating branches
around simple value moves. The following example shows the
corrected code.
          movl         index, r2          ; get the current index
getlck:   evax_ldql    r0, lockdata(r8)   ; and then lock the data
          evax_cmoveq  r0, r3, r2         ; If zero use special index
          incl         r0                 ; increment lock count
          evax_stqc    r0, lockdata(r8)   ; and store it
          tstl         r0                 ; did write succeed?
          beql         getlck             ; retry if not