HP OpenVMS Systems Documentation
HP Fortran for OpenVMS
To specify the types of procedures to be inlined, use the /OPTIMIZE=INLINE=keyword keywords. Also, compile multiple source files together and specify an adequate optimization level, such as /OPTIMIZE=LEVEL=4.
If you omit /OPTIMIZE=INLINE=keyword, the /OPTIMIZE=LEVEL=n qualifier in effect determines the types of procedures that are inlined.
The /OPTIMIZE=INLINE=keyword keywords are as follows:
For information on the inlining of other procedures (inlined at optimization level /OPTIMIZE=LEVEL=4 or higher), see Section 188.8.131.52.
Maximizing the types of procedures that are inlined usually improves run-time performance, but compile-time memory usage and the size of the executable program may increase.
To determine whether using /OPTIMIZE=INLINE=ALL benefits your
particular program, time program execution for the same program
compiled with and without /OPTIMIZE=INLINE=ALL.
5.8.6 Requesting Optimized Code for a Specific Processor Generation (Alpha only)
You can specify the types of optimized code to be generated by using the /OPTIMIZE=TUNE=keyword (Alpha only) keywords. Regardless of the specified keyword, the generated code will run correctly on all implementations of the Alpha architecture. Tuning for a specific implementation can improve run-time performance; it is also possible that code tuned for a specific target may run slower on another target.
Specifying the correct keyword for /OPTIMIZE=TUNE=keyword (Alpha only) for the target processor generation type usually slightly improves run-time performance. Unless you request software pipelining, the run-time performance difference for using the wrong keyword for /OPTIMIZE=TUNE=keyword (such as using /OPTIMIZE=TUNE=EV4 for an EV5 processor) is usually less than 5%. When using software pipelining (using /OPTIMIZE=LEVEL=5) with /OPTIMIZE=TUNE=keyword, the difference can be more than 5%.
The combination of the specified keyword for /OPTIMIZE=TUNE=keyword and the type of processor generation used has no effect on producing the expected correct program results.
The /OPTIMIZE=TUNE=keyword keywords are as follows:
If you omit /OPTIMIZE=TUNE=keyword, HOST is used when /FAST is
specified; otherwise, GENERIC is used.
5.8.7 Requesting Generated Code for a Specific Processor Generation (Alpha only)
You can specify the types of instructions that will be generated for the program unit being compiled by using the /ARCHITECTURE qualifier. Unlike the /OPTIMIZE=TUNE=keyword (Alpha only) option that helps with proper instruction scheduling, the /ARCHITECTURE qualifier specifies the type of Alpha chip instructions that can be used.
Programs compiled with the /ARCHITECTURE=GENERIC option (default) run on all Alpha processors without instruction emulation overhead.
For example, if you specify /ARCHITECTURE=EV6, the code generated will run very fast on EV6 systems, but may run slower on older Alpha processor generations. Because instructions used for the EV6 chip may be present in the program's generated code, code generated for an EV6 system may slow program execution on older Alpha processors when EV6 instructions are emulated by the OpenVMS Alpha Version 7.1 (or later) instruction emulator.
This instruction emulator allows new instructions, not implemented on the host processor chip, to execute and produce correct results. Applications using emulated instructions will run correctly, but may incur significant software emulation overhead at run time.
The keywords used by /ARCHITECTURE=keyword are the same as
those used by /OPTIMIZE=TUNE=keyword. If you omit
/ARCHITECTURE=keyword, HOST is used when /FAST is specified;
otherwise, GENERIC is used. For more information on the
/ARCHITECTURE qualifier, see Section 2.3.6.
5.8.8 Arithmetic Reordering Optimizations
If you use the /ASSUME=NOACCURACY_SENSITIVE qualifier, HP Fortran may reorder code (based on algebraic identities) to improve performance. For example, the following expressions are mathematically equivalent but may not compute the same value using finite precision arithmetic:
X = (A + B) + C
X = A + (B + C)
The results can be slightly different from the default (ACCURACY_SENSITIVE) because of the way intermediate results are rounded. However, the NOACCURACY_SENSITIVE results are not categorically less accurate than those gained by the default. In fact, dot product summations using NOACCURACY_SENSITIVE can produce more accurate results than those using ACCURACY_SENSITIVE.
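The effect of regrouping can be seen with a small test program. The following is a minimal sketch, not taken from the manual: the program name and the values of A, B, and C are assumptions chosen so that single-precision rounding is likely to make the two groupings differ.

```fortran
program reassociation_demo
  implicit none
  real :: a, b, c, x1, x2

  ! Illustrative values (assumed): b and c cancel exactly,
  ! and a is far smaller than the precision of b.
  a = 1.0e-8
  b = 1.0
  c = -1.0

  x1 = (a + b) + c   ! a may be lost when it is added to b
  x2 = a + (b + c)   ! b + c is exactly zero, so a survives

  print *, '(a + b) + c = ', x1
  print *, 'a + (b + c) = ', x2
  print *, 'results differ: ', x1 /= x2
end program reassociation_demo
```

Neither grouping is the "right" one in general; which is closer to the exact result depends on the data.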
|Unoptimized Code||Optimized Code|
| ||T = 1/V|
|DO I=1,N||DO I=1,N|
|B(I) = A(I)/V||B(I) = A(I)*T|
|END DO||END DO|
The transformation in the optimized loop increases performance
significantly, and loses little or no accuracy. However, it does have
the potential for raising overflow or underflow arithmetic exceptions.
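To judge how much accuracy the reciprocal form loses for a particular data set, the two loops in the table can be written out by hand and compared. This is only a sketch under assumed values: the array size, the divisor V, and the names b_div and b_mul are hypothetical, and the compiler may itself transform either loop.

```fortran
program reciprocal_demo
  implicit none
  integer, parameter :: n = 1000
  real :: a(n), b_div(n), b_mul(n), v, t
  integer :: i

  v = 3.0                      ! assumed divisor; 1/3 is not exact in binary
  do i = 1, n
    a(i) = real(i)
  end do

  ! Form written in the source: one divide per iteration.
  do i = 1, n
    b_div(i) = a(i) / v
  end do

  ! Form the optimizer may produce: one divide hoisted out of the loop.
  t = 1.0 / v
  do i = 1, n
    b_mul(i) = a(i) * t
  end do

  print *, 'elements that differ: ', count(b_div /= b_mul), ' of ', n
  print *, 'largest relative difference: ', &
           maxval(abs(b_div - b_mul) / abs(b_div))
end program reciprocal_demo
```

The exception risk comes from the hoisted divide: if V is small enough that 1/V overflows, T is out of range even when every A(I)/V would have been representable.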
5.8.9 Dummy Aliasing Assumption
Some programs compiled with HP Fortran (or Compaq Fortran 77) might have results that differ from the results of other Fortran compilers. Such programs might be aliasing dummy arguments to each other or to a variable in a common block or shared through use association, and at least one variable access is a store. Alternatively, they may be calling a user-defined procedure with actual arguments that do not match the procedure's dummy arguments in order, number, or type.
This program behavior is prohibited in programs conforming to the Fortran 90 and Fortran 95 standards, but not by HP Fortran. Other versions of Fortran allow dummy aliases and check for them to ensure correct results. However, HP Fortran assumes that no dummy aliasing will occur, and it can ignore potential data dependencies from this source in favor of faster execution.
The HP Fortran default is safe for programs conforming to the Fortran 90 and Fortran 95 standards. It will improve performance of these programs, because the standard prohibits such programs from passing overlapped variables or arrays as actual arguments if either is assigned in the execution of the program unit.
The /ASSUME=DUMMY_ALIASES qualifier allows dummy aliasing. It ensures correct results by assuming the exact order of the references to dummy and common variables is required. Program units taking advantage of this behavior can produce inaccurate results if compiled with /ASSUME=NODUMMY_ALIASES.
Example 5-3 is taken from the DAXPY routine in the Fortran-77 version of the Basic Linear Algebra Subroutines (BLAS).
Example 5-3 Using the /ASSUME=DUMMY_ALIASES Qualifier
      SUBROUTINE DAXPY(N,DA,DX,INCX,DY,INCY)
!     Constant times a vector plus a vector.
!     uses unrolled loops for increments equal to 1.
      DOUBLE PRECISION DX(1), DY(1), DA
      INTEGER I,INCX,INCY,IX,IY,M,MP1,N
!
      IF (N.LE.0) RETURN
      IF (DA.EQ.0.0) RETURN
      IF (INCX.EQ.1.AND.INCY.EQ.1) GOTO 20
!     Code for unequal increments or equal increments
!     not equal to 1.
      .
      .
      .
      RETURN
!     Code for both increments equal to 1.
!     Clean-up loop
 20   M = MOD(N,4)
      IF (M.EQ.0) GOTO 40
      DO I=1,M
          DY(I) = DY(I) + DA*DX(I)
      END DO
      IF (N.LT.4) RETURN
 40   MP1 = M + 1
      DO I = MP1, N, 4
          DY(I) = DY(I) + DA*DX(I)
          DY(I + 1) = DY(I + 1) + DA*DX(I + 1)
          DY(I + 2) = DY(I + 2) + DA*DX(I + 2)
          DY(I + 3) = DY(I + 3) + DA*DX(I + 3)
      END DO
      RETURN
      END SUBROUTINE
The second DO loop contains assignments to DY. If DY is overlapped with DA, any of the assignments to DY might give DA a new value, and this overlap would affect the results. If this overlap is desired, then DA must be fetched from memory each time it is referenced. The repetitious fetching of DA degrades performance.
You can link routines compiled with the /ASSUME=DUMMY_ALIASES qualifier to routines compiled with /ASSUME=NODUMMY_ALIASES. For example, if only one routine is called with dummy aliases, you can use /ASSUME=DUMMY_ALIASES when compiling that routine, and compile all the other routines with /ASSUME=NODUMMY_ALIASES to gain the performance value of that qualifier.
Programs calling DAXPY with DA overlapping DY do not conform to the
FORTRAN-77, Fortran 90, and Fortran 95 standards. However, they are
supported if /ASSUME=DUMMY_ALIASES was used to compile the DAXPY
routine.
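The kind of call at issue can be illustrated with a simplified routine. In this sketch, simple_axpy is a hypothetical stand-in for the unit-increment path of DAXPY, and da_copy is an assumed helper variable; neither name comes from the manual. The aliased call is shown only as a comment, and the executed call passes a copy of the scalar so that the program conforms to the standard.

```fortran
program alias_demo
  implicit none
  double precision :: x(4), y(4), da_copy

  x = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0 /)
  y = (/ 2.0d0, 0.0d0, 0.0d0, 0.0d0 /)

  ! Nonconforming call: the multiplier is y(1), which the routine also
  ! assigns through dy. A routine called this way would need to be
  ! compiled with /ASSUME=DUMMY_ALIASES.
  !   call simple_axpy(4, y(1), x, y)

  ! Conforming alternative: pass a copy, so da and dy do not overlap.
  da_copy = y(1)
  call simple_axpy(4, da_copy, x, y)
  print *, y

contains

  ! Hypothetical stand-in for the unit-increment path of DAXPY.
  subroutine simple_axpy(n, da, dx, dy)
    integer, intent(in) :: n
    double precision, intent(in) :: da, dx(n)
    double precision, intent(inout) :: dy(n)
    integer :: i
    do i = 1, n
      dy(i) = dy(i) + da*dx(i)
    end do
  end subroutine simple_axpy

end program alias_demo
```

Passing a copy fixes the multiplier at its value before the call. A program that depends on DA changing as DY is updated does not get the same results this way and still needs /ASSUME=DUMMY_ALIASES.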
5.9 Compiler Directives Related to Performance
Certain compiler source directives (cDEC$ prefix) can be used in place of some performance-related compiler options and provide more control of certain optimizations, as discussed in the following sections:
5.9.1, Using the cDEC$ OPTIONS Directive
5.9.2, Using the cDEC$ UNROLL Directive to Control Loop Unrolling
5.9.3, Using the cDEC$ IVDEP Directive to Control Certain Loop Optimizations
5.9.1 Using the cDEC$ OPTIONS Directive
The cDEC$ OPTIONS directive allows source code control of the alignment of fields in record structures and data items in common blocks. The fields and data items can be naturally aligned (for performance reasons) or they can be packed together on arbitrary byte boundaries.
Using this directive is an alternative to the compiler option /[NO]ALIGNMENT, which affects the alignment of all fields in record structures and data items in common blocks in the current program unit.
For more information:
See the description of the OPTIONS directive in the HP Fortran for OpenVMS Language Reference Manual.
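As an illustration only, the following sketch brackets a common block declaration with the directive pair. The /ALIGN=PACKED option and the END OPTIONS form are written from memory of the directive syntax and should be treated as assumptions to verify against the Language Reference Manual; the names packed_blk, flag, and value are hypothetical. The example uses free-form source, where the cDEC$ prefix is written as !DEC$. A compiler that does not recognize the directive treats it as a comment.

```fortran
program options_demo
  implicit none

  ! Assumed syntax: request packed alignment for the declarations
  ! between the OPTIONS and END OPTIONS directives.
  !DEC$ OPTIONS /ALIGN=PACKED
  integer(kind=1) :: flag
  double precision :: value
  common /packed_blk/ flag, value
  !DEC$ END OPTIONS

  flag = 1
  value = 2.5d0
  print *, flag, value
end program options_demo
```

Packing places value on an odd byte boundary directly after flag, which saves space but can slow access to value. Natural alignment makes the opposite trade.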
5.9.2 Using the cDEC$ UNROLL Directive to Control Loop Unrolling
The cDEC$ UNROLL directive allows you to specify the number of times certain counted DO loops will be unrolled. Place the cDEC$ UNROLL directive before the DO loop whose unrolling you want to control.
Using this directive for a specific loop overrides the value specified by the compiler option /OPTIMIZE=UNROLL=n for that loop. The value specified by /OPTIMIZE=UNROLL=n still determines how many times loops without their own cDEC$ UNROLL directive are unrolled.
For more information:
See the description of the UNROLL directive in the HP Fortran for OpenVMS Language Reference Manual.
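A minimal sketch of the placement follows. The parenthesized count form, UNROLL (4), is assumed from the directive's general syntax and should be checked against the Language Reference Manual; the array names and sizes are hypothetical. The example uses free-form source, where the cDEC$ prefix is written as !DEC$.

```fortran
program unroll_demo
  implicit none
  integer, parameter :: n = 16
  real :: a(n), b(n)
  integer :: i

  do i = 1, n
    a(i) = real(i)
  end do

  ! Request an unroll count of 4 for the loop that follows
  ! (assumed syntax). Other loops keep the compiler default.
  !DEC$ UNROLL (4)
  do i = 1, n
    b(i) = 2.0*a(i)
  end do

  print *, sum(b)
end program unroll_demo
```

The directive changes only how the loop is compiled, not what it computes, so the program output is the same with or without it.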
5.9.3 Using the cDEC$ IVDEP Directive to Control Certain Loop Optimizations
The cDEC$ IVDEP directive allows you to help control certain optimizations related to dependence analysis in a DO loop. Place the cDEC$ IVDEP directive before the DO loop whose optimizations you want to help control. Not all DO loops should use this directive.
The cDEC$ IVDEP directive tells the optimizer to begin dependence analysis by assuming all dependences occur in the same forward direction as their appearance in the normal scalar execution order. This contrasts with normal compiler behavior, which is for the dependence analysis to make no initial assumptions about the direction of a dependence.
For more information:
See the description of the IVDEP directive in the HP Fortran for OpenVMS Language Reference Manual.
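A loop with indirect indexing is a typical candidate, because the compiler cannot tell from the source whether two iterations touch the same element. The sketch below is an assumption-labeled illustration: the array names and the choice of index values are hypothetical, and the free-form !DEC$ spelling of the prefix is used.

```fortran
program ivdep_demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n)
  integer :: indarr(n), i

  do i = 1, n
    a(i) = 0.0
    b(i) = real(i)
    indarr(i) = n + 1 - i   ! a permutation, so no index repeats
  end do

  ! The programmer knows the index values never repeat; the
  ! directive passes that knowledge on to the optimizer.
  !DEC$ IVDEP
  do i = 1, n
    a(indarr(i)) = a(indarr(i)) + b(i)
  end do

  print *, a
end program ivdep_demo
```

If indarr could contain repeated values, the directive would not be appropriate for this loop.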
The following topics are addressed in this chapter:
This chapter describes input/output (I/O) as implemented for HP Fortran. It also provides information about HP Fortran I/O in relation to the OpenVMS Record Management Services (RMS) and Run-Time Library (RTL).
HP Fortran assumes that all unformatted data files will be in the same native little endian numeric formats used in memory. If you need to read or write unformatted numeric data (on disk) that has a different numeric format than that used in memory, see Chapter 9.
You can use HP Fortran I/O statements to communicate between processes on either the same computer or different computers.
In HP Fortran, a logical unit is a channel through which data transfer occurs between the program and a device or file. You identify each logical unit with a logical unit number, which can be any nonnegative integer from 0 to a maximum value of 2,147,483,647 (2**31-1).
For example, the following READ statement uses logical unit number 2:
READ (2,100) I,X,Y
This READ statement specifies that data is to be entered from the device or file corresponding to logical unit 2, in the format specified by the FORMAT statement labeled 100.
When opening a file, use the UNIT specifier to indicate the unit number. You can use the LIB$GET_LUN library routine to return a logical unit number not currently in use by your program. If you intend to use LIB$GET_LUN, avoid using logical unit numbers (UNIT) 100 to 119 (reserved for LIB$GET_LUN).
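The READ statement shown earlier can be placed in a complete program as follows. This is a sketch under assumptions: the scratch file, the data values, and the edit descriptors in the FORMAT statement are illustrative choices, not part of the manual.

```fortran
program unit_demo
  implicit none
  integer :: i
  real :: x, y

  ! Connect logical unit 2 to a temporary file with the UNIT specifier.
  open (unit=2, status='scratch', form='formatted')

  ! Write one record, then return to the start of the file.
  write (2,100) 42, 1.5, -2.25
  rewind (2)

  ! Read the record back through the same unit and format.
  read (2,100) i, x, y
100 format (i5, 2f10.3)

  print *, i, x, y
  close (2)
end program unit_demo
```

Unit 2 is outside the range 100 to 119, so it does not collide with numbers handed out by LIB$GET_LUN.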
HP Fortran programs are inherently device-independent. The association between the logical unit number and the physical file can occur at run time. Instead of changing the logical unit numbers specified in the source program, you can change this association at run time to match the needs of the program and the available resources. For example, before running the program, a command procedure can set the appropriate logical name or allow the terminal user to type a directory, file name, or both.
The OPEN statement connects a unit number with an external file and allows you to explicitly specify file attributes and run-time options using OPEN statement specifiers (all files except internal files are called external files).
ACCEPT, TYPE, and PRINT statements do not refer explicitly to a logical unit (a file or device) from which or to which data is to be transferred; they refer implicitly to a default preconnected logical unit. The ACCEPT statement is normally preconnected to the default input device, and the TYPE and PRINT statements are normally preconnected to the default output device. These defaults can be overridden with appropriate logical name assignments (see Section 184.108.40.206).
READ, WRITE, and REWRITE statements refer explicitly to a specified logical unit from which or to which data is to be transferred. However, to use a preconnected device for READ (SYS$INPUT) and WRITE (SYS$OUTPUT), specify the unit number as an asterisk (*).
Certain unit numbers are preconnected to OpenVMS standard devices. Unit number 5 is associated with SYS$INPUT and unit 6 with SYS$OUTPUT. At run time, if units 5 and 6 are specified by a record I/O statement (such as READ or WRITE) without having been explicitly opened by an OPEN statement, HP Fortran implicitly opens units 5 and 6 and associates them with their respective operating system standard I/O files.
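The three ways of reaching the default output device can be compared in a short program. This sketch contains no OPEN statement, so it relies on the implicit opening described above; the program name and message text are assumptions.

```fortran
program preconnected_demo
  implicit none

  ! No OPEN statement: unit 6 is opened implicitly on first use.
  write (6,'(a)') 'written to unit 6'

  ! An asterisk selects the preconnected output device.
  write (*,'(a)') 'written with an asterisk unit'

  ! PRINT names no unit at all.
  print '(a)', 'written by PRINT'
end program preconnected_demo
```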
Table 6-1 lists the HP Fortran I/O statements.
|Category and Statement Name||Description|
|OPEN||Connects a unit number with an external file and specifies file connection characteristics.|
|CLOSE||Disconnects a unit number from an external file.|
|INQUIRE||Returns information about a named file, a connection to a unit, or the length of an output item list.|
|BACKSPACE||Moves the record position to the beginning of the previous record (sequential access only).|
|ENDFILE||Writes an end-of-file marker after the current record (sequential access only).|
|REWIND||Sets the record position to the beginning of the file (sequential access only).|
|READ||Transfers data from an external file record or an internal file to internal storage.|
|WRITE||Transfers data from internal storage to an external file record or to an internal file.|
|PRINT||Transfers data from internal storage to SYS$OUTPUT (standard output device). Unlike WRITE, PRINT only provides formatted sequential output and does not specify a unit number.|
|HP Fortran Extensions|
|ACCEPT||Reads input from SYS$INPUT. Unlike READ, ACCEPT only provides formatted sequential input and does not specify a unit number.|
|DELETE||Marks a record at the current record position in a relative file as deleted (direct access only).|
|REWRITE||Transfers data from internal storage to an external file record at the current record position. Certain restrictions apply.|
|UNLOCK||Releases a lock held on the current record when file sharing was requested when the file was opened (see Section 6.9.2).|
|TYPE||Writes record output to SYS$OUTPUT (same as PRINT).|
|DEFINE FILE||Specifies file characteristics for a direct access relative file and connects the unit number to the file (like an OPEN statement). Provided for compatibility with compilers older than FORTRAN-77.|
|FIND||Changes the record position in a direct access file. Provided for compatibility with compilers older than FORTRAN-77.|
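Several of the standard statements in Table 6-1 can be exercised together on a sequential file. The following sketch is illustrative only: the unit number 10, the scratch file, and the record contents are assumptions, and the HP Fortran extension statements are not used.

```fortran
program positioning_demo
  implicit none
  integer :: k, val
  logical :: is_open

  ! OPEN connects unit 10 to a temporary sequential file.
  open (unit=10, status='scratch', form='formatted')
  do k = 1, 3
    write (10,'(i4)') 10*k
  end do

  rewind (10)                  ! back to the first record
  read (10,'(i4)') val         ! reads record 1
  read (10,'(i4)') val         ! reads record 2
  backspace (10)               ! position before record 2 again
  read (10,'(i4)') val         ! reads record 2 a second time
  print *, 'record read after BACKSPACE: ', val

  ! INQUIRE reports on the connection before and after CLOSE.
  inquire (unit=10, opened=is_open)
  print *, 'unit 10 connected: ', is_open
  close (10)
  inquire (unit=10, opened=is_open)
  print *, 'unit 10 connected after CLOSE: ', is_open
end program positioning_demo
```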