Table 31 Vector routines called by +Ovectorize (continued)

Vector routine    Description

saxpy             Add a scalar multiple of a vector to a vector, using single-precision operands.
sdot              Compute the dot product of two single-precision vectors.
vec_damax         Find the maximum absolute value in a double-precision vector.
vec_dmult_add     Multiply a scalar by a vector and add the result to the result vector, using double-precision operands.
vec_dsum          Sum the elements of a double-precision vector.

If your PA2.0 application uses very large arrays, compiling with both +Ovectorize and +Odataprefetch may also increase performance. The math library contains special prefetching versions of the vector routines that are called if you specify both options.

If you compile with the +Ovectorize and +Oinfo options, the optimizer will identify which loops it vectorized. If you find that the extent of vectorization is not significant, you may want to consider some other optimization, such as parallelization.
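As an illustration, the following sketch writes out loops in the shapes that three of the Table 31 descriptions suggest. It is not taken from the compiler documentation: the program name, file name, and variable names are hypothetical, and the compile line in the comment is an assumption based on the options discussed above. Whether the optimizer actually substitutes a vector routine for any of these loops is its own decision; the +Oinfo output is the place to check.

```fortran
! Sketch only: loops shaped like the operations described in Table 31.
! Assumed compile line (file name is hypothetical):
!   f90 +O3 +Ovectorize +Oinfo vecloops.f90
PROGRAM vecloops
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000
  REAL :: a, x(n), y(n), dot
  DOUBLE PRECISION :: d(n), total
  INTEGER :: i

  a = 2.0
  x = 1.0
  y = 3.0
  d = 0.5D0

  ! saxpy shape: add a scalar multiple of x to y (single precision)
  DO i = 1, n
    y(i) = a * x(i) + y(i)
  END DO

  ! sdot shape: dot product of two single-precision vectors
  dot = 0.0
  DO i = 1, n
    dot = dot + x(i) * y(i)
  END DO

  ! vec_dsum shape: sum the elements of a double-precision vector
  total = 0.0D0
  DO i = 1, n
    total = total + d(i)
  END DO

  PRINT *, y(1), dot, total
END PROGRAM vecloops
```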

Controlling vectorization locally

When you compile with the +Ovectorize option, the optimizer considers all loops in the source file as candidates for vectorization. The *$* [NO]VECTORIZE directive enables you to limit vectorization. Use the *$* NOVECTORIZE form of the directive to disable vectorization and the *$* VECTORIZE form to enable it. The directive takes effect at the beginning of the next loop and remains in effect for the rest of the program unit or until superseded by a later directive. The directive is ignored unless you compile with the +Ovectorize option at optimization level 3 or higher.

For example, if a file containing the following code segment were compiled with +Ovectorize, only one loop would be considered as a candidate for vectorization:

!This is line 1 of the source file.
!*$* NOVECTORIZE
   .
   .
   .
!*$* VECTORIZE
DO i = 1, 100
   .
   .
   .
END DO
!*$* NOVECTORIZE
   .
   .
   .

Note that the *$* VECTORIZE directive does not force vectorization. The optimizer vectorizes only if:

• The loop performs a vector operation recognized by the optimizer as in its repertoire.

• The loop is safe to vectorize. The same conditions that can prevent parallelization can also prevent vectorization; see, for example, “Data dependences” (page 102).

• The optimizer can discover no other transformations that can result in better performance.
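To make the safety condition concrete, here is a minimal sketch contrasting a loop whose iterations are independent with one that carries a dependence from each iteration to the next. The program and variable names are hypothetical, and the comments describe the general reasoning about dependences, not the optimizer's documented behavior for these particular loops.

```fortran
! Sketch only: an independent loop versus a loop-carried dependence.
PROGRAM depcheck
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 8
  REAL :: a(n), b(n), c(n)
  INTEGER :: i

  a = 1.0
  b = 2.0
  c = 1.0

  ! Independent iterations: a(i) uses only its own old value and b(i),
  ! so the iterations can be treated as a single vector operation.
  DO i = 1, n
    a(i) = a(i) + 3.0 * b(i)
  END DO

  ! Loop-carried dependence: c(i) needs the value of c(i-1) that the
  ! previous iteration just stored, so the iterations must run in order.
  DO i = 2, n
    c(i) = c(i-1) + b(i)
  END DO

  PRINT *, a(n), c(n)
END PROGRAM depcheck
```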

The only way to ensure vectorization is for the programmer to edit the source file and substitute an appropriate call to the BLAS library for the loop, as described in “Controlling vectorization locally” (page 104).
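A minimal sketch of such a substitution follows, using the standard BLAS routine saxpy, whose arguments are the vector length, the scalar, the first vector and its stride, and the second vector and its stride. The program and variable names are hypothetical. The sketch assumes that default REAL is single precision and that the program is linked against a BLAS library; the link option depends on your installation.

```fortran
! Sketch only: an explicit BLAS call in place of a hand-written loop.
! Requires linking against a BLAS library.
PROGRAM blascall
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 100
  REAL :: a, x(n), y(n)
  EXTERNAL saxpy

  a = 2.0
  x = 1.0
  y = 3.0

  ! Replaces the loop:
  !   DO i = 1, n
  !     y(i) = a * x(i) + y(i)
  !   END DO
  ! Arguments: length, scalar, x, stride of x, y, stride of y
  CALL saxpy(n, a, x, 1, y, 1)

  PRINT *, y(1), y(n)
END PROGRAM blascall
```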
