HP OpenVMS Systems Documentation 
OpenVMS VAX RTL Mathematics (MTH$) Manual
2.3 Vector Versions of Existing Scalar Routines
Vector forms of many MTH$ routines are provided to support vectorized compiled applications. Vector versions of key F-floating, D-floating, and G-floating scalar routines employ vector hardware while maintaining identical results with their scalar counterparts. Many of the scalar algorithms have been redesigned to ensure identical results and good performance for both the vector and scalar versions of each routine. All vectorized routines return results that are bit-for-bit identical to those of the scalar versions.
You can call the vector MTH$ routines directly if your program is written in VAX MACRO. If you are a Fortran programmer, specify the Fortran intrinsic function name only. The Fortran compiler then determines whether the vector or scalar version of a routine should be used.

2.3.1 Exceptions
You should not attempt to recover from an MTH$ vector exception. After an MTH$ vector exception, the vector routines cannot continue execution, and nonexceptional values might not have been computed.

2.3.2 Underflow Detection
In general, if a vector instruction results in the detection of both a floating overflow and a floating underflow, only the overflow is signaled.
Some scalar routines check whether the user has enabled underflow detection. For each of those scalar routines, there are two corresponding vector routines: one that always enables underflow checking and one that never enables underflow checking. (In the latter case, underflows produce a result of zero.) The Fortran compiler always chooses the vector version that does not signal underflows, unless the user specifies the /CHECK=UNDERFLOW qualifier. This ensures that the check is performed when requested but does not impair vector performance for those not interested in underflow detection.

2.3.3 Vector Routine Name Format
Use one of the formats in Table 2-3 to call (from VAX MACRO) a vector math routine that enables underflow signaling.
(The E in the routine name means enabled underflow signaling.)
Use one of the formats in Table 2-4 to call (from VAX MACRO) a vector math routine that does not enable underflow signaling.
In the preceding formats, the following conventions are used:

2.3.4 Calling a Vector Math Routine
You can call the vector MTH$ routines directly if your program is written in VAX MACRO.
In the following examples, keep in mind that vector real arguments are passed in V0, V1, and so on, and vector real results are returned in V0. Vector complex arguments, on the other hand, are passed in register pairs: V0 and V1, V2 and V3, and so on. Vector complex results are returned in V0 and V1.
The following example shows how to call the vector version of MTH$EXP. Assume that you do not want underflows to be signaled, and you need to use the current contents of all vector and scalar registers after the invocation. Before you can call the vector routine from VAX MACRO, perform the following steps.
The following MACRO program fragment shows this example. Assume that:
Note that MTH$VEXP_R3_V6 denotes an F-floating data type because there is no letter between V and E in the routine name. (For further explanation, refer to Section 2.3.3.) The stride (the number of array elements that are skipped) must be a multiple of 4 because each F-floating value requires 4 bytes.
The following example demonstrates how to call the vector version of OTS$POWDD with a vector base raised to a scalar power. Before you can call the vector routine from VAX MACRO, perform the following steps.
The following MACRO program fragment shows how to call OTS$VPOWDD_R1_V8 to compute the result of raising 60 values to the power P. Assume that:
Note that OTS$VPOWDD_R1_V8 raises a D-floating base to a D-floating power, which you determine from the DD in the routine name. (For further explanation, refer to Section 2.3.3.) The stride (the number of array elements that are skipped) must be a multiple of 8 because each D-floating value requires 8 bytes.
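The stride requirements in the two examples above come down to simple arithmetic on the element size. The following Python sketch illustrates that arithmetic only; it is not part of the MTH$ interface. It assumes that the stride handed to a vector routine is expressed in bytes, as the multiple-of-4 and multiple-of-8 rules suggest, and the helper names byte_stride and is_valid_stride are hypothetical.

```python
# Sizes in bytes of the VAX floating-point data types discussed in this section.
ELEMENT_BYTES = {"F": 4, "D": 8, "G": 8}


def byte_stride(data_type, elements_apart=1):
    """Return the byte distance between elements that are
    `elements_apart` array positions from each other.

    Hypothetical helper: it assumes the stride is given in bytes.
    """
    if elements_apart < 1:
        raise ValueError("elements_apart must be at least 1")
    return ELEMENT_BYTES[data_type] * elements_apart


def is_valid_stride(data_type, stride_bytes):
    """Check that a byte stride is a whole multiple of the element size."""
    return stride_bytes % ELEMENT_BYTES[data_type] == 0


if __name__ == "__main__":
    # Contiguous F-floating elements are 4 bytes apart.
    print(byte_stride("F"))
    # Every second D-floating element is 16 bytes apart.
    print(byte_stride("D", 2))
    # A 6-byte stride cannot address whole F-floating elements.
    print(is_valid_stride("F", 6))
```

Under this assumption, a contiguous F-floating array uses a stride of 4 and a contiguous D-floating array uses a stride of 8; larger multiples select every second, third, or later element.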

2.4 Fast-Vector Math Routines
This section describes the fast-vector math routines, which offer significantly higher performance at the cost of slightly reduced accuracy when compared with the corresponding standard vector math routines. Also note that some fast-vector math routines have restricted argument domains.
When you specify the compile command qualifiers /VECTOR and /MATH_LIBRARY=FAST, the Compaq Fortran compiler selects the appropriate fast-vector math routine, if one exists. The default is /MATH_LIBRARY=ACCURATE. You must specify the /G_FLOATING compile qualifier in conjunction with the /MATH_LIBRARY=FAST and /VECTOR qualifiers to access the G_floating routines.
You can call these routines from VAX MACRO using the standard calling method. The math function names, together with the corresponding entry points of the fast-vector math routines, are listed in Table 2-5.

2.4.1 Exception Handling
The fast-vector math routines signal all errors except floating underflow. No intermediate calculations result in exceptions. To optimize performance, the following message signals all errors:
%SYSTEM-F-VARITH, vector arithmetic fault

2.4.2 Special Restrictions on Input Arguments
The special restrictions listed in Table 2-6 apply only to the fast-vector routines SIN, COS, and TAN. The standard vector routines handle the full range of VAX floating-point numbers.
If the application program uses arguments outside of the listed domain, the routine returns the following error message:
%SYSTEM-F-VARITH, vector arithmetic fault
If the application requires argument values beyond the listed limits, use the corresponding standard vector math routine.

2.4.3 Accuracy
The fast-vector math routines do not guarantee the same results as those obtained with the corresponding standard vector math routines. Calls to the fast-vector routines generally yield results that differ from those of the scalar and original vector MTH$ library routines. The typical maximum error is a 2-LSB (least significant bit) error for the F_floating routines and a 4-LSB error for the D_floating and G_floating routines. This generally corresponds to a difference in the 6th significant decimal digit for the F_floating routines, the 15th digit for D_floating, and the 14th digit for G_floating.

2.4.4 Performance
The fast-vector math routines generally provide performance improvements over the standard vector routines ranging from 15 to 300 percent, depending on the routines called and the input arguments to the routines. The overall performance of a typical user application that uses the fast-vector math routines also improves, but not by as much as the routines themselves. You should do performance and correctness testing of your application using both the fast-vector and the standard vector math routines before deciding which to use for your application.
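For the correctness testing recommended above, one way to compare results is to count how many least-significant-bit steps separate a fast-vector result from the standard result. The following Python sketch illustrates that measurement only. It uses IEEE single precision as a stand-in for F_floating, on the assumption that both formats carry 24 significant bits; the two formats differ in exponent range and bit layout, so this is an approximation rather than an exact F_floating comparison. The function names are hypothetical.

```python
import struct


def ordered_bits(x):
    """Round a finite value to IEEE single precision and map it to an
    integer that changes by 1 for each representable step."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Negative values are stored sign-magnitude; fold them below zero
    # so that the mapping is monotonic across the whole number line.
    return bits if bits < 0x80000000 else 0x80000000 - bits


def lsb_distance(a, b):
    """Number of single-precision steps (LSBs) between two finite values."""
    return abs(ordered_bits(a) - ordered_bits(b))


def worst_lsb_error(reference, candidate):
    """Largest LSB distance between paired reference and candidate results."""
    return max(lsb_distance(r, c) for r, c in zip(reference, candidate))


if __name__ == "__main__":
    # Made-up sample values, not output from any MTH$ routine.
    standard_results = [1.0, 2.0, -0.5]
    fast_results = [1.0 + 2.0 ** -23, 2.0, -0.5 - 2.0 ** -23]
    print(worst_lsb_error(standard_results, fast_results))
```

A test harness could run the same inputs through both sets of routines and check the worst-case distance against the typical 2-LSB figure quoted in Section 2.4.3 for the F_floating routines.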
